Abstract
Executive Summary


Effectively protecting a dataset, in the sense of properly anonymizing its informative content, is
a complex problem. Whatever the specific disclosure risk to be counteracted, the scientific
community has devoted considerable effort to devising effective protection techniques. These
techniques have led to two interpretations of privacy: syntactic and semantic. While syntactic
privacy demands that, for example, each release of data must be indistinguishably related to no
less than a certain number of individuals in the population, semantic privacy aims at satisfying a
property of the mechanism chosen for releasing the data. We will discuss a line of approaches for
both concepts. For syntactic privacy, k-anonymity, ℓ-diversity and t-closeness will be outlined.
For semantic privacy, ε-differential privacy, (ε, δ)-differential privacy and ε-local differential
privacy will be outlined. We note that syntactic privacy algorithms
have interpretable privacy guarantees and have seen wide adoption for anonymized microdata re- 
lease. However, they consider specific aspects of the problem and hence can remain vulnerable 
to possible attacks, and can have limited applicability to high-dimensional data. Algorithms for 
semantic privacy have seen wide adoption for perturbation of statistical functions. Their mathe- 
matically strict privacy guarantees are, however, comparatively hard to interpret. Recent studies 
have pointed out that both approaches are reasonable, successfully applicable to different scenar- 
ios, and there is room for both of them, possibly jointly adopted. 

We pay special attention to privacy in the context of machine learning, due to the ubiquity 
of machine learning in modern software applications, which require massive amounts of train- 
ing data. We consider machine learning architectures for classification and generative models, 
and quantify privacy in the context of machine learning via a threat model based on membership 
inference attacks (aiming to identify who might be contained in the training data). 

We see several promising research directions. On the one hand, data owners can be supported in
choosing privacy parameters, for example by translating abstract privacy parameters such as ε into
concrete risk probabilities for identity disclosure. On the other hand, liability risks for data
analysts handling sensitive or personal information can be mitigated by the complementary use of
encryption for information security and anonymization for data privacy.


1. Introduction 


The online collection of big data, such as people’s behaviors, is becoming a major driver of the 
digital economy. Data and its analysis form the resources of tomorrow’s economy as businesses 
that have collected such massive amounts of data are actively looking for ways of monetizing it. 
However, while there are great economic opportunities, there are also societal risks. Big data
collection and analysis may allow sensitive inferences about people’s lives. For example, genomic or
health data, which are major drivers of big data, may allow inferences about a disposition to certain
illnesses or personality traits. Future employers could leverage that information to deny access
to certain career paths. Even shopping data may reveal such sensitive health-care related
information, as the case of Target’s advertising shows [Hil12]. Hence, it is a societal challenge to
balance these objectives of spurring economic growth and preserving personal privacy. Solutions 
include a variety of approaches from self-controlling and privacy-respecting behavior, to legal reg- 
ulation and technical protection means. No single solution can work by itself and any technical 
approach needs to integrate into the legal framework. In particular, the EU data protection reg- 
ulation includes the categories of personal, pseudonymized and anonymized data. Personal (and 
pseudonymized) data may only be used for the purpose it has been collected for and if such data 
are used for other purposes - which may relate to monetization - the data needs to be anonymized. 
Anonymization means that no re-identification is possible without the original dataset. This proves
to be a challenging technical task, especially since data may have inherent patterns that persist
over time. Hence, a small de-anonymized sample may suffice to re-identify entire anonymized
datasets. An example attack of this kind is the re-identification of smart meter data [JJR11]. As a
consequence of these challenges, almost no data can be left unmodified for proper anonymization.
The idea of sharing sensitive, unmodified data is thus fundamentally challenged. The goal of
MOSAICrOWN is to provide a set of functionalities for data owners to apply anonymization with
measurable and reliable guarantees. As such, the envisioned MOSAICrOWN functionalities are
anonymization methods, each of which provides a privacy parameter that can be set appropriately to
balance privacy against utility. Hence, the MOSAICrOWN project aims to enable the sharing of
personal or sensitive big data sources, i.e., preserving sufficient utility, while also preserving
the privacy of the data.

Furthermore, due to the ubiquity of machine learning (ML) in modern software applications,
we pay special attention to privacy in the context of ML, which requires large amounts of training
data. The collection of sufficient training data, to satisfy model generalization and provide
meaningful utility, has proven difficult and has resulted, in some cases, in privacy violations
(e.g., [The17], [HMDD19]). We consider two ML architectures that have been investigated for use in
data markets: feedforward neural networks for classification and generative models¹ for data
generation. For each architecture, we discuss how privacy can be quantified via membership inference
attacks, which aim to discover who (an individual or group of individuals) is contained in the
training data, and mitigation techniques for such threats.

¹ Generative models are ML models that are trained to learn the joint probability distribution
p(X, Y) of features X and labels Y of training data.

In this document, we will outline the specific technical challenges in anonymization and
interpretations of the privacy parameters. Furthermore, we detail privacy threat models for machine
learning in the form of membership inference attacks. In the remainder of this section we detail
the state of the art and the innovations produced by MOSAICrOWN. In Chapter 2 we formalize the
concept of anonymity (Section 2.1) and related protection techniques (Section 2.2). We then discuss
syntactic and semantic privacy interpretations, and introduce technical means for enforcement
(e.g., k-anonymity, ε-differential privacy) in Chapter 3. In Chapter 4 we quantify privacy in the
context of machine learning by defining threat models and mitigation techniques. We conclude in
Section 5.


1.1 State of the Art and MOSAICrOWN Innovation 


In this section, we summarize the state of the art for privacy metrics and data sanitization and the 
innovation produced by MOSAICrOWN. 


1.1.1 State of the Art 


As studied in the document, many anonymization approaches have been proposed. However, none 
can provide a one-size-fits-all solution for all privacy protection and utility needs. 

Anonymization approaches that follow a syntactic privacy interpretation are typically enforced via
generalization (e.g., replacing precise values with coarser ones) and suppression (e.g., removing
identifiers). Such approaches preserve data truthfulness (i.e., do not directly alter the data) but
sacrifice data completeness (due to coarser or removed values).

Approaches following a semantic privacy interpretation typically use perturbation (e.g., additive
noise). Such methods directly alter the data by modifying its informative content and thus do not
preserve data truthfulness. Hence, the modification mechanisms have to be carefully tuned to allow
statistical inference on the modified data.


1.1.2 MOSAICrOWN Innovation


The innovations produced by MOSAICrOWN regarding privacy metrics and data sanitization are
discussed in this section.


• The first innovation is given in the form of the analysis of the concept of anonymization and
  related protection techniques in Chapter 2, namely non-perturbative techniques, that do not
  alter the data but remove details (e.g., generalization), and perturbative techniques, that alter
  the data (e.g., by adding noise). Based on this analysis, we discuss syntactic and semantic
  privacy interpretations and protection techniques in Chapter 3.

• The second innovation consists in the quantification of privacy for machine learning with
  membership inference in Chapter 4. Machine learning requires large amounts of (sensitive)
  training data and membership inference attacks aim to infer whether an individual, or a set
  of individuals, belongs to a training dataset. We present and discuss mitigations and privacy
  parameter selections with regard to membership inference via semantic techniques during
  machine learning training.
2. Sanitization of Data


Protecting the privacy of the individuals to whom a dataset refers requires sanitizing the dataset,
that is, modifying it so that an individual cannot be linked to their information. A possible
approach to sanitization is to anonymize the dataset, which has proven to be a complex task.
In this chapter, we illustrate the main issues characterizing the anonymization of data
(Section 2.1), and present some data protection techniques that can be adopted (Section 2.2).


2.1 The Anonymity Problem 


The problem of data anonymization has been heavily investigated in the context of microdata
release, where datasets are represented as relational tables with one record for each individual
(called respondent), and one column for each attribute related to the respondents (e.g., name, date
of birth, job, etc.). The first step for protecting the privacy of a dataset is removing (e.g., by
deleting or encrypting) any identifying attributes, such as names, e-mail addresses, or unique
identifiers (such as the social security number). This process, usually referred to as
de-identification, is unfortunately not enough to ensure the anonymity of the data. In fact, a
de-identified dataset can still include other information, called quasi-identifiers (QI), which can
be linked to external sources to reduce the uncertainty about the identity of some respondents
[DFLS12]. Based on a study performed on the US 2000 Census, Golle discovered that 63% of the entire
US population is uniquely identifiable by the combination of gender, ZIP code, and full date of
birth [Gol06]. The following example illustrates such re-identification risks. Consider
Figure 2.1(a), illustrating a de-identified dataset containing information for a set of
hospitalized patients. Figure 2.1(b) illustrates a sample excerpt of a fictitious publicly available
voter list of a New York City municipality. It is easy to see that attributes DoB, Sex, and ZIP can
be exploited for linking the two datasets, possibly re-identifying (with either full confidence or
a certain probability) some of the de-identified respondents in Figure 2.1(a). For instance, the
de-identified dataset includes only one female respondent born on 1958/12/11 and living in the
10180 area (record 11): if this combination of quasi-identifying values is unique in the external
world as well, then the voter list can be exploited to uniquely re-identify the eleventh record
with respondent Kathy Doe, also disclosing the fact that she has been hospitalized for epilepsy.
Considering that tremendous amounts of data are generated and shared every day, the availability of
non de-identified datasets that can be used for linking is a realistic threat. Famous incidents
that made headlines include the Netflix re-identification [NS08], where a de-identified set of
Netflix recommendations was re-identified using public IMDB recommendations, and the
re-identification of credit card data [IMRSP15], where cardholders could be re-identified given a
few sample purchases. Unfortunately, unlike direct identifiers, removing QI information might not
be a feasible strategy, since the QI can represent a large portion of the informative content of a
dataset. Its complete removal therefore risks reducing data utility (e.g., also removing the
quasi-identifiers from the de-identified dataset in Figure 2.1(a) would leave only a list of
diseases, most likely of limited interest to final recipients).
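
The linking attack described above can be illustrated with a minimal Python sketch. It uses only a three-record excerpt of the de-identified table and the single voter record of Figure 2.1. The function names and the format normalization (two-digit years read as 19xx, "female" mapped to "F") are assumptions made for this illustration and are not part of the original example.

```python
# Illustrative linking attack on the quasi-identifier (DoB, Sex, ZIP).
medical = [  # de-identified table, excerpt of Figure 2.1(a)
    {"DoB": "1960/05/02", "Sex": "F", "ZIP": "10041", "Disease": "stroke"},
    {"DoB": "1958/12/11", "Sex": "F", "ZIP": "10180", "Disease": "epilepsy"},
    {"DoB": "1955/12/30", "Sex": "F", "ZIP": "10045", "Disease": "gastritis"},
]

voters = [  # public voter list, Figure 2.1(b)
    {"Name": "Kathy Doe", "DoB": "58/12/11", "Sex": "female", "ZIP": "10180"},
]


def normalize(voter):
    """Map the voter-list formats onto the formats of the medical table."""
    # Assumption: two-digit years in the voter list refer to 19xx.
    return ("19" + voter["DoB"], voter["Sex"][0].upper(), voter["ZIP"])


def link(medical, voters):
    """Return (name, disease) pairs whose quasi-identifier matches exactly one record."""
    matches = []
    for voter in voters:
        qi = normalize(voter)
        hits = [r for r in medical if (r["DoB"], r["Sex"], r["ZIP"]) == qi]
        if len(hits) == 1:  # a unique match re-identifies the respondent
            matches.append((voter["Name"], hits[0]["Disease"]))
    return matches


reidentified = link(medical, voters)
```

The normalization step also hints at why differing formats between sources make linking harder, as discussed later in this section: without it, the two records would not match.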




SSN | Name | DoB        | Sex | ZIP   | Disease
    |      | 1960/05/02 | F   | 10041 | stroke
    |      | 1960/05/20 | M   | 10032 | dyspepsia
    |      | 1960/05/12 | M   | 10037 | achlorhydria
    |      | 1960/05/05 | F   | 10044 | epilepsy
    |      | 1955/09/01 | M   | 10043 | helicobacter
    |      | 1955/09/02 | M   | 10042 | helicobacter
    |      | 1955/09/10 | F   | 10039 | helicobacter
    |      | 1955/09/20 | F   | 10030 | helicobacter
    |      | 1955/12/07 | M   | 10030 | dermatitis
    |      | 1955/12/05 | M   | 10031 | retinitis
    |      | 1958/12/11 | F   | 10180 | epilepsy
    |      | 1955/12/25 | F   | 10042 | dermatitis
    |      | 1955/12/30 | F   | 10045 | gastritis
    |      | 1960/04/02 | F   | 10036 | stroke
    |      | 1960/04/05 | F   | 10034 | labyrinthitis
    |      | 1960/04/10 | M   | 10047 | gastritis
    |      | 1960/04/30 | M   | 10048 | dyspepsia

(a)

Name      | Address      | City          | ZIP   | DoB      | Sex    | Education
Kathy Doe | 300 Main St. | New York City | 10180 | 58/12/11 | female | secondary

(b)


Figure 2.1: An example of a de-identified medical dataset (a) and of a publicly available non
de-identified dataset (b)


Given a de-identified dataset, two main kinds of improper disclosure can occur [Fed05]. 


• Identity disclosure, occurring whenever the identity of a respondent can be somehow
  determined and associated with a record in the de-identified dataset;

• Attribute disclosure, occurring when a (sensitive) value can be associated with an individual
  (without necessarily being able to link the value to a specific record).


There are several factors that can contribute to (or, conversely, reduce) the risks of identity
and attribute disclosure [Fed05]. For instance, high-visibility records, which assume uncommon
values for certain attributes (e.g., a very high income, a rare disease, or a very uncommon job),
stand out from the other records and thus increase the disclosure risks. Similarly, the more
attributes the dataset has in common with an external source of information (and the more external
sources are available), the higher the disclosure risks. By contrast, the natural noise
characterizing the dataset and the external sources, the presence of data that might not be
completely up-to-date or that refer to different temporal intervals, and the use of different
formats for representing the information in the dataset and in the external sources can contribute
to decreasing the disclosure risks.


2.2 Protection Techniques 


Whatever the specific disclosure risk that should be counteracted, the scientific community has
devoted major efforts to come up with effective protection techniques [CDFS07]. A first distinction
can be made between masking techniques and synthetic data generation techniques: while the
latter aim at producing a new, synthetic dataset that maintains some statistical properties of
the original data (and can then be safely released or published instead of the original one), the
former operate directly on the original data, to sanitize them before release. Masking techniques
can be classified based on how they operate on the original data, as follows.


• Non-perturbative techniques do not directly modify the original data, but remove details
  from the dataset. Such techniques preserve data truthfulness, sacrificing data completeness
  by producing imprecise and/or incomplete data. Examples of non-perturbative techniques
  include sampling, suppression, generalization, global recoding, and bucketization.
  Sampling consists in releasing data that are related to a subset of the original population.
  Protection is provided by the uncertainty about the presence in the dataset of the
  information about a specific respondent. Suppression selectively removes information from the
  dataset (for instance, direct identifiers are typically suppressed before release, as discussed
  above). Generalization selectively replaces some values in the dataset with more general
  ones: for instance, a complete date of birth can be generalized by releasing only the month
  and year, or the year of birth. A possible way to enforce generalization is based on the
  definition of generalization hierarchies, identifying the possible generalized values that can
  be used to replace more specific ones. Global recoding, which can be seen as a particular
  kind of generalization, partitions the set of values that can be assumed by an attribute into
  disjoint intervals, usually of the same width, and associates a label with each interval.
  Instead of releasing the original values, the labels of the corresponding intervals are published.
  Two examples of global recoding techniques, specifically designed for numerical attributes,
  are top-coding and bottom-coding. Top-coding replaces all values that are above a certain
  threshold with a given label (e.g., high incomes over 1 million dollars are replaced by
  label “>1M”). Bottom-coding substitutes the values under a given threshold with a given
  label (e.g., low incomes less than 50 thousand dollars are replaced by label “<50K”).
  Bucketization operates on sets of attributes whose joint visibility should be prevented (e.g.,
  the name and the disease of a patient), and operates by first partitioning records in buckets
  and attributes in groups, and then shuffling the partitioned records within buckets so as to
  break their correspondence [DFJ+15], [DFJ+10], [LLZM12], [XT06].


• Perturbative techniques distort a dataset by modifying its informative content. Such
  techniques do not preserve data truthfulness, and hence modifications should be limited so as
  not to compromise the possibility of correctly performing analysis (i.e., the results of the
  analysis carried out on the perturbed data should not significantly differ from those computed
  on the original one). Examples of perturbative techniques include noise addition and
  microaggregation. Noise addition intuitively adds non-deterministic, controlled noise to the
  original data collection before release. Protection is provided by the fact that some values (or
  combinations among them) included in the released table might not correspond to real ones due
  to perturbation, and by the fact that some values (or combinations among them) included in
  the original table might not be included in the released one. The noise distribution and its
  magnitude (e.g., sampling from a standard normal distribution vs. sampling from a uniform
  distribution) are chosen to balance the trade-off between utility and data protection.
  Microaggregation (originally proposed for continuous numerical data and then extended also to
  categorical data [Tor04]) selectively replaces original records with new ones. Microaggregation
  operates by first clustering the records in the original dataset in groups of a certain
  cardinality in such a way that records in the same cluster are similar to each other, and then by
  replacing the records in a cluster with a representative one computed through an aggregation
  operator (e.g., mean or median).
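
Some of the masking operations listed above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the numbering of generalization levels, and the choice of zero-mean Gaussian noise with an arbitrary scale are assumptions of this example. The income thresholds are the ones used as examples in the text.

```python
import random


def generalize_dob(dob, level):
    """Generalize 'YYYY/MM/DD': level 0 = full date, 1 = year/month, 2 = year."""
    year, month, _day = dob.split("/")
    return [dob, f"{year}/{month}", year][level]


def generalize_zip(zip_code, digits):
    """Replace the last `digits` characters of a ZIP code with '*'."""
    return zip_code[:len(zip_code) - digits] + "*" * digits


def top_bottom_code(income, low=50_000, high=1_000_000):
    """Top- and bottom-coding: replace extreme values with a label."""
    if income > high:
        return ">1M"
    if income < low:
        return "<50K"
    return str(income)


def add_noise(values, scale, rng):
    """Perturbative masking: add zero-mean Gaussian noise to each value."""
    return [v + rng.gauss(0.0, scale) for v in values]


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
noisy_ages = add_noise([20, 25, 30], scale=2.0, rng=rng)
```

The first three functions are non-perturbative: their output is coarser than the input but never false. The last one is perturbative, since the released values generally differ from the real ones.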


The protection techniques illustrated above can be adopted to effectively protect a dataset. It is
not possible to recommend a single technique that covers all analytical requirements and any data
collection. However, we detail and compare the different techniques in this Deliverable to provide
guidance. Given a data collection to be protected and released, some key
questions then need to be answered: what technique should be used? Should a joint combination 
of techniques be preferred to a single one? To which portion of the data (e.g., the entire dataset, 
a subset of records, a subset of attributes) should the technique be applied? Whatever the answer 
to these questions, an important observation is that all protection techniques cause an inevitable 
information loss: non-perturbative techniques produce datasets that are not as complete or precise 
as the original ones, and perturbative techniques produce datasets that are distorted. For these 
reasons, it is necessary to define protection approaches that satisfy a privacy requirement via a 
controlled adoption of some of these protection techniques while limiting information loss, as 
illustrated in the remainder of this Deliverable. 
3. Anonymization of Data


As mentioned in the previous chapter, anonymization represents a possible strategy for sanitizing 
a dataset for protecting the privacy of the individuals to whom the dataset refers. Clearly, depend- 
ing on the privacy requirement that is to be satisfied by anonymization, different interpretations of 
privacy can exist. A broad classification can distinguish between syntactic and semantic interpre- 
tations [DFLS12]. Syntactic interpretations capture the protection degree enjoyed by respondents 
with a numerical value. Related anonymization approaches aim at satisfying a syntactic privacy 
requirement (e.g., each release of data must be indistinguishably related to no less than a certain 
number of individuals in the population). On the other hand, semantic interpretations are based 
on the satisfaction of a semantic privacy requirement, and the related anonymization approaches 
aim at satisfying a property of the mechanism chosen for releasing the data (e.g., the result of 
an analysis carried out on a released dataset must be insensitive to the insertion or deletion of a 
record in the dataset). We illustrate some anonymization approaches that follow a syntactic privacy
interpretation in Section 3.1 and a semantic privacy interpretation in Section 3.2.


3.1 Syntactic Privacy 


We now illustrate some anonymization approaches that build on, and satisfy, a syntactic definition
of privacy. We start from the first proposal in this direction (k-anonymity, Section 3.1.1) and
discuss some extensions (Sections 3.1.2 and 3.1.3).


3.1.1 k-Anonymity 


The first approach for anonymizing a dataset, originally framed in the context of microdata
publishing and aiming to protect against identity disclosure (see Section 2.1), is represented
by k-anonymity [Sam01]. k-Anonymity enforces a protection requirement typically applied by
statistical agencies, which demands that any released information be indistinguishably related to
no less than a certain number k of respondents. Following the assumption that re-identification
takes advantage of the quasi-identifying attributes (see Section 2.1), this general requirement is
translated into the k-anonymity requirement: each release of data must be such that every
combination of values of quasi-identifiers can be indistinctly matched to at least k respondents
[Sam01]. A dataset satisfies the k-anonymity requirement if and only if each record in the released
dataset cannot be related to fewer than k individuals in the population, and vice-versa (i.e., each
individual in the population cannot be related to fewer than k records in the dataset). These two
conditions hold since the original definition of k-anonymity assumes that each respondent is
represented by at most one record in the released dataset and vice-versa (i.e., each record
includes information related to one respondent only). Verifying the satisfaction of the k-anonymity
requirement would require knowledge of all existing external sources of information that an
adversary might use for the linking attack. This assumption is unrealistic in practice, and
k-anonymity therefore takes a safe approach, requiring that each respondent be indistinguishable
from at least k - 1 other respondents in the released dataset. A dataset is then said to be
k-anonymous if each combination of values of the quasi-identifier appears in it with either zero or
at least k occurrences. Since each combination of quasi-identifying values is shared by at least k
different records in the dataset, each respondent cannot be associated with fewer than k records in
the released dataset and vice-versa, thus satisfying the original k-anonymity requirement.

Traditional approaches to enforce k-anonymity operate on quasi-identifying attributes by modifying
their values in the dataset to be released, while leaving sensitive and non-sensitive attributes as
they are (let us recall that direct identifiers are removed as the first step). Among the possible
data protection techniques that might be enforced on the quasi-identifier, k-anonymity typically
relies on the combined adoption of generalization and suppression, which have the advantage of
preserving data truthfulness when compared to perturbative techniques (e.g., noise addition, see
Section 2.2). Let us recall that generalization operates by replacing values with more general ones
(e.g., a complete date of birth might be generalized to the year of birth), while suppression
operates by selectively removing values. Suppression is used in combination with generalization, as
it can help in reducing the amount of generalization to be enforced for achieving k-anonymity. This
way, it is possible to produce more precise (though incomplete) datasets. The intuitive rationale
is that if a dataset includes a limited number of outliers (i.e., quasi-identifying values with
fewer than k occurrences) that would force a large amount of generalization to satisfy k-anonymity,
then these outliers could be more conveniently removed from the dataset, improving the quality of
released data. For instance, consider the dataset in Figure 2.1(a) and assume that the
quasi-identifier is composed of attribute ZIP only. Since there is only one person living in the
10180 area (11th record), to achieve k-anonymity with k > 1 attribute ZIP should be generalized by
removing the last three digits. However, if the 11th record in the dataset is suppressed,
8-anonymity can be achieved by generalizing the ZIP code removing only the last digit.


SSN | Name | DoB     | Sex | ZIP   | Disease
    |      | 1960/05 | *   | 100** | stroke
    |      | 1960/05 | *   | 100** | dyspepsia
    |      | 1960/05 | *   | 100** | achlorhydria
    |      | 1960/05 | *   | 100** | epilepsy
    |      | 1955/09 | *   | 100** | helicobacter
    |      | 1955/09 | *   | 100** | helicobacter
    |      | 1955/09 | *   | 100** | helicobacter
    |      | 1955/09 | *   | 100** | helicobacter
    |      | 1955/12 | *   | 100** | dermatitis
    |      | 1955/12 | *   | 100** | retinitis
    |      | 1955/12 | *   | 100** | dermatitis
    |      | 1955/12 | *   | 100** | gastritis
    |      | 1960/04 | *   | 100** | stroke
    |      | 1960/04 | *   | 100** | labyrinthitis
    |      | 1960/04 | *   | 100** | gastritis
    |      | 1960/04 | *   | 100** | dyspepsia


Figure 3.1: An example of a 4-anonymous dataset assuming QI={DoB, Sex, ZIP}
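
The ZIP-only example can be checked with a short Python sketch. It assumes a quasi-identifier made of attribute ZIP only, as in the example. The helper names (`anonymity_level`, `min_generalization`) are hypothetical and chosen for this illustration, and the brute-force search over the number of masked digits is not meant as an anonymization algorithm.

```python
from collections import Counter

# ZIP codes of the 17 records in Figure 2.1(a), in order.
ZIPS = ["10041", "10032", "10037", "10044", "10043", "10042", "10039",
        "10030", "10030", "10031", "10180", "10042", "10045", "10036",
        "10034", "10047", "10048"]


def generalize_zip(zip_code, digits):
    """Replace the last `digits` characters of a ZIP code with '*'."""
    return zip_code[:len(zip_code) - digits] + "*" * digits


def anonymity_level(qi_values):
    """Largest k for which the release is k-anonymous, i.e., the size of
    the smallest equivalence class over the quasi-identifier values."""
    return min(Counter(qi_values).values())


def min_generalization(zips, k):
    """Smallest number of masked trailing digits that yields k-anonymity."""
    for digits in range(len(zips[0]) + 1):
        if anonymity_level([generalize_zip(z, digits) for z in zips]) >= k:
            return digits
    return None  # only reached when k exceeds the number of records


# Without suppression, the outlier 10180 forces three digits to be masked.
with_outlier = min_generalization(ZIPS, k=2)

# Suppressing the 11th record and masking one digit gives two classes of 8.
without_outlier = ZIPS[:10] + ZIPS[11:]
k_after_suppression = anonymity_level(
    [generalize_zip(z, 1) for z in without_outlier])
```

Under these assumptions, `with_outlier` evaluates to 3 and `k_after_suppression` to 8, matching the figures given in the text.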

Generalization and suppression can be applied at different granularity levels. For instance,
with reference to relational tables, generalization can be applied at the cell and attribute levels,
and suppression at the cell, attribute, and record levels. The combined use of generalization and
suppression at different granularity levels produces different classes of approaches to enforce
k-anonymity [CDFS07]. The majority of the approaches available in the literature adopt
attribute-level generalization and record-level suppression [Sam01]. Figure 3.1 illustrates
a 4-anonymous version of the dataset in Figure 2.1(a), obtained through attribute-level
generalization (DoB, Sex, and ZIP have been generalized by removing the day of birth, sex, and last
two digits of the ZIP code, respectively) and record-level suppression (the 11th record related to
Kathy has been suppressed). Symbol * represents any value in the attribute domain. Cell-level
generalization has also been investigated as an approach to produce k-anonymous datasets, as it has
been shown to reduce the information loss with respect to attribute-level generalization [LDR06].
These approaches, however, have the drawback of producing datasets where the values in the cells
of the same column may be heterogeneous (e.g., some records report the complete date of birth,
while other records only report the year of birth).


Age | Disease          Age | Disease
20  | flu              25  | flu
25  | gastritis        25  | gastritis
30  | dermatitis       25  | dermatitis
35  | stroke           40  | stroke
40  | dyspepsia        40  | dyspepsia
45  | asthma           40  | asthma

      (a)                    (b)


Figure 3.2: An example of a dataset (a) and of a 3-anonymous version of it (b) obtained adopting
microaggregation and assuming QI={Age}


Regardless of the level at which generalization and suppression are applied, they
inevitably cause a certain amount of information loss (the original informative content is either
reduced in detail or removed), and it is therefore essential to compute a k-anonymous dataset
that, while protecting users’ privacy, is still useful to the recipients. To this aim, it is
necessary to compute an optimal k-anonymization minimizing generalization and suppression. This has
been shown to be an NP-hard problem [CDFS07], and both exact and heuristic algorithms have been
proposed.


As a last remark on k-anonymity, we note that generalization, while having the advantage
of preserving data truthfulness, can face scalability issues especially for high-dimensional data,
where generalization might need to cover a high number of dimensions [Agg05]. Some recent
approaches have been proposed to obtain k-anonymity through microaggregation instead of
generalization (see Section 2.2) [DT05b], [SDSM14]. To this aim, the QI undergoes microaggregation,
so that each combination of QI values in the original dataset is replaced with a microaggregated
version. For instance, consider the dataset in Figure 3.2(a) and suppose that the quasi-identifier
includes attribute Age, while Disease is the sensitive attribute. Figure 3.2(b) illustrates a
3-anonymous version of the dataset in Figure 3.2(a) obtained through microaggregation: the original
QI values have been grouped in two clusters (the first three records in one, and the last three
records in the other one) according to their similarity (in this example, according to an ordering
over them), and the values in each cluster are then replaced with a representative aggregate value
(in this example, the mean). Note that, since microaggregation is a perturbative protection
technique, k-anonymous datasets computed adopting this approach do not preserve data truthfulness
(e.g., all records in Figure 3.2(b) but the second and the fifth are not real records according to
the original values in Figure 3.2(a)).
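
The microaggregation step of this example can be sketched as follows. This is a minimal univariate variant written for illustration: it sorts the values, cuts them into consecutive groups of k, and replaces each value with the mean of its group. Merging a too-small trailing group into the previous one is an assumption of this sketch, not a rule stated in the text, and the approaches cited above may cluster multi-attribute quasi-identifiers differently.

```python
def microaggregate(values, k):
    """Univariate microaggregation: sort, cut into groups of k, replace by group mean.

    Returns the masked values in the original record order. Assumption of this
    sketch: leftover records (fewer than k) are merged into the previous group.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        groups[-2].extend(groups.pop())  # keep every group at size >= k
    masked = [None] * len(values)
    for group in groups:
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            masked[i] = mean
    return masked


ages = [20, 25, 30, 35, 40, 45]          # Age column of Figure 3.2(a)
masked_ages = microaggregate(ages, k=3)  # Age column of Figure 3.2(b)
```

For the ages of Figure 3.2(a), the two group means are 25 and 40, which reproduces the Age column of Figure 3.2(b).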
3.1.2 ℓ-Diversity


While k-anonymity represents an effective solution to protect respondent identities, it does not
protect against attribute disclosure [Sam01]. A k-anonymous dataset can, in fact, still be
vulnerable to attacks allowing a recipient to determine (with either full confidence or
non-negligible probability) the sensitive information of a respondent. In particular, two attacks
that may cause attribute disclosure in a k-anonymous dataset are the homogeneity attack and the
external knowledge attack [MKGV07], as follows.


• Homogeneity attack. The homogeneity attack occurs when all the records in an equivalence
  class (i.e., the set of records with the same value for the quasi-identifier) in a k-anonymous
  dataset assume the same value for the sensitive attribute. If a data recipient knows the
  quasi-identifier value of a target individual, the recipient can identify the equivalence class
  representing the target, and then discover the value of the target's sensitive attribute. For
  instance, consider the 4-anonymous dataset in Figure 3.1 and suppose that a recipient knows that
  Gloria is a female living in 10039 area and born on 1955/09/10. Since all the records in the
  equivalence class with quasi-identifier value equal to (1955/09, *, 100**) assume value
  helicobacter for attribute Disease, the recipient can infer that she suffers from a helicobacter
  infection.


• External knowledge attack. The external knowledge attack occurs when the data recipient
possesses some additional (not included in the k-anonymous dataset) knowledge about the 
respondent, and can use it to reduce her uncertainty about the value of the sensitive attribute 
of a target respondent. For instance, consider the 4-anonymous dataset in Figure and 
suppose that a recipient knows that her neighbor Mina is a female living in 10045 area and 
born on 1955/12/30. Observing the 4-anonymous dataset, the recipient can only infer that 
her neighbor suffers from dermatitis, retinitis, or gastritis. Suppose now that the recipient 
sees Mina tanning without screens at the park every day: due to this external information, 
the recipient can exclude that Mina suffers from dermatitis or retinitis, discovering that she 
suffers from gastritis. 


The original definition of k-anonymity has been extended to ℓ-diversity with the aim of
counteracting these two attacks. The idea behind ℓ-diversity is to also consider the values of the
sensitive attribute when clustering the original records, so that at least ℓ well-represented values
for the sensitive attribute are included in each equivalence class [MKGV07]. While several definitions
of “well-represented” values have been proposed, the simplest formulation of ℓ-diversity requires
that each equivalence class be associated with at least ℓ different values for the sensitive attribute.
For instance, consider the 4-anonymous and 3-diverse dataset in Figure 3.3 and suppose that a
recipient knows that her neighbor Mina, a female living in 10045 area and born on 1955/12/30, tans
every day at the park (see example above). The recipient can now only exclude value dermatitis,
but she cannot be sure about whether Mina suffers from gastritis or a helicobacter infection.
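
The simplest formulation of ℓ-diversity (at least ℓ distinct sensitive values per equivalence class) is easy to check mechanically. The following Python sketch does so for the generalized records of Figure 3.3; the helper name `is_l_diverse` and the dictionary-based record layout are assumptions made for illustration only.

```python
from collections import defaultdict

# Generalized records of Figure 3.3, grouped by quasi-identifier value (DoB, Sex, ZIP).
FIGURE_3_3 = {
    ("1955", "M", "100**"): ["helicobacter", "helicobacter", "dermatitis", "retinitis"],
    ("1960", "F", "100**"): ["stroke", "epilepsy", "stroke", "labyrinthitis"],
    ("1955", "F", "100**"): ["helicobacter", "helicobacter", "dermatitis", "gastritis"],
    ("1960", "M", "100**"): ["dyspepsia", "achlorhydria", "gastritis", "dyspepsia"],
}
records = [
    {"DoB": dob, "Sex": sex, "ZIP": zip_code, "Disease": disease}
    for (dob, sex, zip_code), diseases in FIGURE_3_3.items()
    for disease in diseases
]


def is_l_diverse(rows, qi, sensitive, l):
    """True if every equivalence class holds at least l distinct sensitive values."""
    values_per_class = defaultdict(set)
    for row in rows:
        values_per_class[tuple(row[a] for a in qi)].add(row[sensitive])
    return all(len(values) >= l for values in values_per_class.values())


print(is_l_diverse(records, ["DoB", "Sex", "ZIP"], "Disease", 3))
print(is_l_diverse(records, ["DoB", "Sex", "ZIP"], "Disease", 4))
```

The same check can be plugged into any k-anonymization procedure as an additional acceptance condition on candidate equivalence classes.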

The problem of computing an ℓ-diverse dataset minimizing the loss of information caused
by generalization and suppression is computationally hard. However, since ℓ-diversity basically
requires computing a k-anonymous dataset (with additional constraints on the sensitive values),
any algorithm proposed to compute a k-anonymous dataset that minimizes loss of information can
be adapted to also guarantee ℓ-diversity, simply by checking whether the condition on the diversity
of the sensitive attribute values is satisfied by all the equivalence classes [MKGV07].

SSN | Name | DoB  | Sex | ZIP   | Disease
    |      | 1955 | M   | 100** | helicobacter
    |      | 1955 | M   | 100** | helicobacter
    |      | 1955 | M   | 100** | dermatitis
    |      | 1955 | M   | 100** | retinitis
    |      | 1960 | F   | 100** | stroke
    |      | 1960 | F   | 100** | epilepsy
    |      | 1960 | F   | 100** | stroke
    |      | 1960 | F   | 100** | labyrinthitis
    |      | 1955 | F   | 100** | helicobacter
    |      | 1955 | F   | 100** | helicobacter
    |      | 1955 | F   | 100** | dermatitis
    |      | 1955 | F   | 100** | gastritis
    |      | 1960 | M   | 100** | dyspepsia
    |      | 1960 | M   | 100** | achlorhydria
    |      | 1960 | M   | 100** | gastritis
    |      | 1960 | M   | 100** | dyspepsia

Figure 3.3: An example of 4-anonymous and 3-diverse dataset assuming QI={DoB, Sex, ZIP}


DoB        | Sex | ZIP   | GroupID
1955/09/01 | M   | 94143 | G1
1955/09/02 | M   | 94142 | G1
1955/12/07 | M   | 94130 | G1
1955/12/05 | M   | 94131 | G1
1960/05/02 | F   | 94141 | G2
1960/05/05 | F   | 94144 | G2
1960/04/02 | F   | 94136 | G2
1960/04/05 | F   | 94134 | G2
1955/09/10 | F   | 94139 | G3
1955/09/20 | F   | 94130 | G3
1955/12/25 | F   | 94142 | G3
1955/12/30 | F   | 94145 | G3
1960/05/20 | M   | 94132 | G4
1960/05/12 | M   | 94137 | G4
1960/04/10 | M   | 94147 | G4
1960/04/30 | M   | 94148 | G4

GroupID | Disease       | Count
G1      | helicobacter  | 2
G1      | dermatitis    | 1
G1      | retinitis     | 1
G2      | stroke        | 2
G2      | epilepsy      | 1
G2      | labyrinthitis | 1
G3      | helicobacter  | 2
G3      | dermatitis    | 1
G3      | gastritis     | 1
G4      | dyspepsia     | 2
G4      | achlorhydria  | 1
G4      | gastritis     | 1

Figure 3.4: An example of a bucketized 3-diverse relation assuming QI={DoB, Sex, ZIP}


As a last remark on ℓ-diversity, we underline an approach for its enforcement that departs
from generalization, adopting instead a bucketization-based approach (see Section 2.2). For
instance, Anatomy (but also other more general techniques that can handle the ℓ-diversity
requirement [DFJ+10]) is a bucketization-based approach enforcing ℓ-diversity without relying
on generalization. With this approach, the records in the original dataset are first partitioned
in groups that satisfy ℓ-diversity. All buckets so created are then labeled with their own group
identifier, and the original dataset is split into two fragments, in such a way that one includes
the attributes composing the quasi-identifier, and the other includes the sensitive attribute.
Each record is associated in each fragment with the identifier of the group to which it belongs,
and each group in the fragment storing the sensitive attribute includes one record for each
sensitive value appearing in the group, together with the frequency with which the value is
represented in the group. For instance, Figure 3.4 reports a bucketization-based 3-diverse version
of the original dataset in Figure 2.1(a) computed with the Anatomy approach: it is easy to see that
the protection guarantees offered by the fragments are the same as those offered by the 3-diverse
dataset in Figure 3.3, computed instead through traditional generalization.
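
The fragmentation step of a bucketization-based release can be sketched as follows. This Python sketch only covers the split into two fragments once the groups are known; it does not implement Anatomy's grouping algorithm. The function name `bucketize` is hypothetical, and the pairing of individual records with diseases is an assumption made for illustration (only the per-group counts are taken from Figure 3.4).

```python
from collections import Counter


def bucketize(rows, qi, sensitive, group_key="GroupID"):
    """Split grouped records into a QI fragment and a sensitive-value fragment."""
    qi_fragment = []
    counts = Counter()
    for row in rows:
        group = row[group_key]
        # QI fragment: exact quasi-identifier values plus the group identifier.
        qi_fragment.append(tuple(row[a] for a in qi) + (group,))
        # Sensitive fragment: one entry per (group, value) with its frequency.
        counts[(group, row[sensitive])] += 1
    sensitive_fragment = [
        (group, value, n) for (group, value), n in sorted(counts.items())
    ]
    return qi_fragment, sensitive_fragment


# Group G1 of Figure 3.4; the record-to-disease pairing below is hypothetical.
g1 = [
    {"DoB": "1955/09/01", "Sex": "M", "ZIP": "94143", "Disease": "helicobacter", "GroupID": "G1"},
    {"DoB": "1955/09/02", "Sex": "M", "ZIP": "94142", "Disease": "dermatitis", "GroupID": "G1"},
    {"DoB": "1955/12/07", "Sex": "M", "ZIP": "94130", "Disease": "helicobacter", "GroupID": "G1"},
    {"DoB": "1955/12/05", "Sex": "M", "ZIP": "94131", "Disease": "retinitis", "GroupID": "G1"},
]
qi_part, sensitive_part = bucketize(g1, ["DoB", "Sex", "ZIP"], "Disease")
print(qi_part)
print(sensitive_part)
```

Because the two fragments share only the group identifier, a recipient cannot tell which of the records in a group carries which of the group's sensitive values.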
3.1.3 t-Closeness


Although ℓ-diversity represents a first step in counteracting attribute disclosure, an ℓ-diverse
dataset might still be vulnerable to information leakage caused by skewness and similarity attacks
[LLV07], as follows.


• Skewness attack. The skewness attack occurs when a data recipient can observe significant
  differences in the frequency distribution of the sensitive values within an equivalence class,
  with respect to the frequency distribution of the same values in the population (or in the
  whole dataset). The idea is that differences in these distributions signal changes in the
  probability with which a respondent in the equivalence class is associated with a specific
  sensitive value. As an example, consider the 3-diverse dataset in Figure 3.3 and suppose that
  a recipient knows that Alice is a female living in 10041 area and was born on 1960/05/02. Since
  two out of the four records in the equivalence class with quasi-identifier value (1960, F, 100**)
  assume value stroke for attribute Disease, it is possible to infer that Alice has a 50%
  probability of having had a stroke, compared to 12.5% in the whole dataset.


• Similarity attack. The similarity attack is caused by the fact that ℓ-diversity only requires
  that the ℓ values in an equivalence class be syntactically different, without constraints on their
  semantics. This attack occurs when, in an ℓ-diverse dataset, the sensitive values of the
  records in an equivalence class are semantically similar, although (as required by ℓ-diversity)
  syntactically different. For instance, consider the 3-diverse dataset in Figure 3.3 and suppose
  that a recipient knows that Carl is a male living in 10037 area and born on 1960/03/12.
  Observing the 3-diverse dataset, the recipient can infer that Carl suffers from either gastritis,
  dyspepsia, or achlorhydria and, therefore, that Carl suffers from a stomach-related disease.


The definition of t-closeness has been proposed to counteract these two attacks by
requiring that the frequency distribution of the sensitive values in each equivalence class be close
(i.e., with distance smaller than a fixed threshold t) to that in the whole dataset. Enforcing
t-closeness ensures that the skewness attack has no effect, since the knowledge of the
quasi-identifier value for a target respondent does not change the probability for a malicious
recipient of correctly guessing the sensitive value associated with the respondent. t-Closeness also
reduces the effectiveness of the similarity attack, because the presence of semantically similar
values in an equivalence class can only be due to the presence of the same values in the dataset with
similar relative frequencies, thus not increasing the knowledge of the recipient.

The enforcement of t-closeness requires evaluating the distance between the frequency
distribution of the sensitive attribute values in the whole dataset and in each equivalence class, and
several distance metrics can be used to this aim [LLV07].
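
A t-closeness check can be sketched in Python as below, again on the records of Figure 3.3. The choice of distance is an assumption: the sketch uses the total variation distance between categorical distributions as one simple option, whereas the cited work discusses several metrics. All function names are hypothetical.

```python
from collections import Counter, defaultdict

# Generalized records of Figure 3.3, grouped by quasi-identifier value (DoB, Sex, ZIP).
FIGURE_3_3 = {
    ("1955", "M", "100**"): ["helicobacter", "helicobacter", "dermatitis", "retinitis"],
    ("1960", "F", "100**"): ["stroke", "epilepsy", "stroke", "labyrinthitis"],
    ("1955", "F", "100**"): ["helicobacter", "helicobacter", "dermatitis", "gastritis"],
    ("1960", "M", "100**"): ["dyspepsia", "achlorhydria", "gastritis", "dyspepsia"],
}


def distribution(values):
    """Relative frequency of each value."""
    counts = Counter(values)
    return {value: n / len(values) for value, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two categorical distributions."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)


def max_class_distance(classes):
    """Largest distance between a class distribution and the overall one."""
    overall = distribution([v for values in classes.values() for v in values])
    return max(total_variation(distribution(vs), overall) for vs in classes.values())


def is_t_close(classes, t):
    """True if every equivalence class is at distance smaller than t."""
    return max_class_distance(classes) < t


print(max_class_distance(FIGURE_3_3))
print(is_t_close(FIGURE_3_3, 0.5))
```

Under this particular distance, the class (1960, F, 100**) is the furthest from the overall distribution, which matches the skewness example on stroke given above.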


3.2 Semantic Privacy 


A key insight on which semantic privacy was formulated is the impossibility proof of Dalenius'
desideratum, which demands that nothing about an individual be learnable from the database
that cannot be learned without access to the database [Dwo06]. In the presence of auxiliary
information, an adversary will always be capable of inferring some information about an individual in
a dataset given some function result computed from this dataset. Differential privacy, a semantic
privacy definition [Dwo06], provides a metric that quantifies the privacy risk incurred by
participating in a dataset. Concretely, differential privacy measures how plausibly an individual can
deny membership in a dataset. In contrast to previous anonymization methods based on
generalization, differential privacy achieves anonymization of a dataset D = {d_1, ..., d_n} by
perturbation. Thus, unlike the previously introduced generalization-based concepts, differential
privacy does not preserve data truthfulness.

Differential privacy can be enforced either locally on each entry in D, or centrally on the result
of a query function f(·) over D. Within the next subsections we will introduce the foundations of
differential privacy, and the mechanisms that achieve it.


3.2.1 ε-Differential Privacy


Differential privacy is frequently enforced in the central setting. The central setting comprises
three actors: Data Owner, Data Analyst and Curator.

Here, Data Owner possesses a dataset D, from data domain DOM, containing sensitive or
personally identifiable information. D shall be shared with Data Analyst such that Data Analyst is
able to evaluate a query function f(D). However, Data Analyst shall be prevented from learning
the original result of f(D). In the central model, the query function f(·) is thus evaluated and
perturbed by Curator (trusted third party server) such that it is no longer possible to confidently
determine whether f(·) was evaluated on D, or some neighboring dataset D' differing in one
individual. Mechanisms M fulfilling Definition 1 are used for perturbation of f(·).


Definition 1 (ε-Differential Privacy [Dwo06]) A mechanism M gives ε-Differential Privacy if
for all D, D' ⊆ DOM differing in at most one element, and all sets S ⊆ Range(M)

\[ \Pr[M(D) \in S] \le e^{\epsilon} \cdot \Pr[M(D') \in S], \]

where Range(M) denotes the set of all possible outputs of mechanism M.


Throughout MOSAICrOWN we will refer to ε as the privacy parameter. While the choice and
interpretation of ε depends on the concrete dataset D at hand, small ε values tend to result in high
privacy, and vice versa.

Perturbation is influenced by sensitivity. According to Dwork and Roth [DR14], sensitivity
can be interpreted as the maximum impact that a single individual can have on the return value of
query function f. Definition 1 holds for all possible differences |f(D) − f(D')| by adapting to the
global sensitivity of f(·) per Definition 2. The absolute nature of global sensitivity implies that an
individual's impact on the result of a query function will never be greater than Δf.


Definition 2 (Global Sensitivity) Let D and D' be neighboring. The global sensitivity of a
function f(·), denoted by Δf, is defined as

\[ \Delta f = \max_{D, D'} | f(D) - f(D') |. \]


In the following we will introduce two common mechanisms for adding ε-differentially private
noise: the Laplace mechanism for numeric perturbation and the exponential mechanism for the
perturbation of numeric and categorical values. The choice between the two mechanisms is mostly
motivated by the insensitivity and stability of the input data with regard to noise and by the
enforcement model.

The Laplace mechanism of Theorem 1 is suited for the enforcement of ε-differential privacy
on numerical valued queries which provide the analyst with a real valued answer. An example for
such an operation is a count query.

(Plot: probability of the mechanism output x for D = {a, a} and D' = {a, b, a}, with the two
distributions centered at 0 = f_#b(D) and 1 = f_#b(D').)

Figure 3.5: Laplace mechanism example: count query


Theorem 1 (Laplace Mechanism [DR14]) Given a numerical query function f : DOM → R^k,
the Laplace mechanism

\[ M_{Lap}(D, f, \epsilon) = f(D) + (z_1, \ldots, z_k) \]

is an ε-differentially private mechanism when all z_i with 1 ≤ i ≤ k are independently drawn from
Z ~ Lap(z, λ = Δf/ε, μ = 0).


For the proof we refer to Dwork and Roth [DR14]. As the name already indicates,
the Laplace mechanism samples noise from an underlying Laplace distribution. The Laplace
distribution is a symmetric version of the exponential distribution centered around mean μ = 0 with
scaling factor λ. The Laplace mechanism adds noise with scale λ = Δf/ε. Therefore, sensitivity Δf is a
factor in determining how accurately query results can be published while preserving a desired
level of privacy [DR14]. The scale λ grows (1) as the sensitivity Δf increases for a given
privacy parameter ε or (2) as the privacy parameter ε decreases for a given sensitivity Δf.

We now discuss an example. Assume a count function f_#b(·) for value b, an original
dataset D = {a, a} and a neighboring dataset D' = {a, b, a}. Without the application of differential
privacy we would observe two deterministic outputs:

\[ 0 = f_{\#b}(D), \qquad 1 = f_{\#b}(D') \tag{3.1} \]

By using the Laplace mechanism we span a Laplace distribution around f_#b(D) and f_#b(D'), and
thus the function result x becomes non-deterministic, as depicted in Figure 3.5.

ε-differential privacy assures that, for any function result x, the ratio between the
probabilities of x stemming from D' and from D is bounded:

\[ \frac{\Pr[M_{Lap}(f_{\#b}(D')) = x]}{\Pr[M_{Lap}(f_{\#b}(D)) = x]} \le e^{\epsilon} \]
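
The count-query example can be reproduced with a short Python sketch. It is illustrative only: all function names are hypothetical, the privacy parameter is set to 1 as an arbitrary choice, and Laplace noise is drawn as the difference of two exponential variables (a standard identity) so that only the standard library is needed. The sketch samples one noisy count and numerically checks the density-ratio bound on a grid of outputs.

```python
import math
import random


def laplace_noise(scale, rng=random):
    """Draw Laplace(0, scale) noise as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(true_answer, sensitivity, epsilon, rng=random):
    """Perturb a numeric answer with noise of scale sensitivity / epsilon."""
    return true_answer + laplace_noise(sensitivity / epsilon, rng)


def laplace_pdf(x, loc, scale):
    """Density of the Laplace distribution with location `loc` and the given scale."""
    return math.exp(-abs(x - loc) / scale) / (2.0 * scale)


def count_b(dataset):
    """Count query for value 'b'."""
    return sum(1 for value in dataset if value == "b")


D = ["a", "a"]
D_prime = ["a", "b", "a"]
epsilon = 1.0       # illustrative privacy parameter (assumption)
sensitivity = 1.0   # a count changes by at most 1 between neighboring datasets
scale = sensitivity / epsilon

# Largest ratio of output densities over a grid of possible results x in [-5, 6].
worst_ratio = max(
    laplace_pdf(x / 10, count_b(D_prime), scale) / laplace_pdf(x / 10, count_b(D), scale)
    for x in range(-50, 61)
)
print(worst_ratio <= math.exp(epsilon) + 1e-9)
print(laplace_mechanism(count_b(D), sensitivity, epsilon, random.Random(0)))
```

The ratio reaches its maximum for outputs x ≥ 1, where it equals e^ε up to floating-point rounding.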


Besides the additive Laplace mechanism, the Exponential Mechanism, provided in
Definition 3, is a widely used choice for arbitrary perturbation of categorical and numerical data.
This mechanism is useful in situations when adding noise to the result of a query destroys its
value [DR14]. Instead, the exponential mechanism samples from a more meaningful, predefined range
of possible outputs.

Definition 3 (Exponential Mechanism [LLSY16]) For any quality function q : (𝒟 × O) → R, where
𝒟 is the set of all possible datasets, and a privacy parameter ε, the exponential mechanism
M_q^ε(D) outputs o ∈ O with probability proportional to exp(ε q(D, o) / (2Δq)), where for all
datasets D, D' differing in one record

\[ \Delta q = \max_{\forall o,\, D \simeq D'} | q(D, o) - q(D', o) | \]

is the sensitivity of the quality function. That is,

\[ \Pr[M_q^{\epsilon}(D) = o] = \frac{\exp\left( \frac{\epsilon\, q(D, o)}{2 \Delta q} \right)}{\sum_{o' \in O} \exp\left( \frac{\epsilon\, q(D, o')}{2 \Delta q} \right)}. \]

We refer the interested reader to McSherry and Talwar for the initial proof that the
exponential mechanism satisfies ε-differential privacy. Note that the exponential mechanism allows
encoding a preference towards values close to the true value through the quality function.

We now revisit and extend an example initially outlined by Cormode for the
use of the exponential mechanism. Assume we want to compute the (lower) median of a sorted
array with an even number of unique elements:

\[ D = \{1, 3, 5, 7, 8, 11, 12, 13, 17, 22\}. \]

Consequently, the rank of each element in D is

\[ \mathrm{rank}_D(\cdot) = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, \]

thus, the true lower median is 8. For the application of the exponential mechanism
we first need to define a quality function that assigns a high score to elements in D which have a
rank close to that of the true lower median:

\[ q(\cdot, D) = -\left| \mathrm{rank}_D(\cdot) - \left( \frac{|D|}{2} - 1 \right) \right|, \text{ yielding} \]

\[ -4, -3, -2, -1, 0, -1, -2, -3, -4, -5. \]

In the exponential mechanism, quality function scores become weights:

\[ \mathrm{weight}(\cdot) = \exp\left( \frac{\epsilon}{2 \Delta q}\, q(\cdot, D) \right). \]


Figure 3.6 illustrates the resulting weight distribution. Note that the distribution is centered
around the element with the highest utility (i.e., the original median). The shape of this distribution
yields high utility, since probability falls sharply as the quality score decreases [DR14].
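
The median example can be traced with the following Python sketch. It is a minimal illustration under stated assumptions: the function names are hypothetical, the sensitivity of the rank-based quality function is assumed to be Δq = 1, the privacy parameter is an arbitrary choice, and the target rank is computed for arrays of even length only, as in the example.

```python
import math
import random


def rank_scores(sorted_values):
    """Quality score per element: negative distance of its rank to the lower-median rank."""
    target = len(sorted_values) // 2 - 1  # lower-median rank for an even-length array
    return [-abs(rank - target) for rank in range(len(sorted_values))]


def exponential_mechanism_probs(scores, epsilon, sensitivity=1.0):
    """Output probabilities proportional to exp(epsilon * score / (2 * sensitivity))."""
    weights = [math.exp(epsilon * s / (2.0 * sensitivity)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def private_median(sorted_values, epsilon, rng=random):
    """Sample one element according to the exponential mechanism."""
    probs = exponential_mechanism_probs(rank_scores(sorted_values), epsilon)
    return rng.choices(sorted_values, weights=probs, k=1)[0]


D = [1, 3, 5, 7, 8, 11, 12, 13, 17, 22]
epsilon = 1.0  # illustrative privacy parameter (assumption)

scores = rank_scores(D)
probs = exponential_mechanism_probs(scores, epsilon)
print(scores)
print([round(p, 3) for p in probs])
print(private_median(D, epsilon, random.Random(0)))
```

Increasing `epsilon` concentrates the probability mass on the true lower median, while values near 0 flatten the distribution towards uniform.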

The Fundamental Law of Information Recovery states that privacy cannot be guaranteed if
overly accurate answers to too many questions are made public [DN03], and this disintegration
of privacy must be measured. A beneficial characteristic of differential privacy is the ability to
quantify the privacy loss of an individual within a dataset over a series of ε-differentially private
mechanism evaluations. This characteristic is referred to as composition. The most basic
composition theorem is sequential composition as stated in Theorem 2.

Figure 3.6: Exponential mechanism example: median query


Theorem 2 (Sequential Composition [DR14]) Let M_1, M_2, ..., M_k be k algorithms (that take
auxiliary inputs) that satisfy ε_1-DP, ε_2-DP, ..., ε_k-DP, respectively, with respect to the input
dataset D. Publishing

\[ t = \langle t_1, t_2, \ldots, t_k \rangle, \text{ where } t_1 = M_1(D),\ t_2 = M_2(t_1, D),\ \ldots,\ t_k = M_k(\langle t_1, \ldots, t_{k-1} \rangle, D) \]

satisfies (Σ_{i=1}^{k} ε_i)-DP.


Sequential composition is a naive pessimistic view, assuming that the privacy loss (i.e.,
information gain of Data Analyst) is maximum in each invocation of a mechanism. There is a
series of results that yield tighter bounds on the privacy loss by analyzing it as a random
variable [DRV10, ACG+16, KOV17].
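
Sequential composition lends itself to simple bookkeeping: every mechanism invocation consumes part of a total privacy budget, and the consumed parts add up. The Python sketch below illustrates this; the class name `PrivacyBudget` and its interface are hypothetical and only cover plain sequential composition, not the tighter accounting methods mentioned above.

```python
class PrivacyBudget:
    """Tracks the summed epsilon of sequentially composed mechanism invocations."""

    def __init__(self, total_epsilon):
        if total_epsilon <= 0:
            raise ValueError("total_epsilon must be positive")
        self.total_epsilon = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon):
        """Record one epsilon-DP invocation and return the remaining budget."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        # Small tolerance so that floating-point rounding does not reject an exact fit.
        if self.spent + epsilon > self.total_epsilon + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon
        return self.total_epsilon - self.spent


budget = PrivacyBudget(total_epsilon=2.0)
print(budget.spend(1.0))  # first query
print(budget.spend(1.0))  # second query uses up the rest
```

A curator would call `spend` before each mechanism evaluation and refuse to answer once the call raises.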


3.2.2 (ε, δ)-Differential Privacy


Definition 1 is very strict with respect to only allowing relative differences in probabilities of all
possible outputs between a database D and any neighboring database D'. This strictness leads to
the inapplicability of some statistical distributions for ε-differential privacy, and might especially
have a severe effect on utility (e.g., trading in a small loss in privacy for a large gain in utility by
suppressing low probability elements from the set of possible outputs).

(ε, δ)-differential privacy introduces an additional, additive privacy parameter δ besides the
relative privacy parameter ε to mitigate the previous arguments. We formalize (ε, δ)-differential
privacy in Definition 4.


Definition 4 ((ε, δ)-Differential Privacy [DR14]) A mechanism M gives (ε, δ)-Differential
Privacy if for all D, D' ⊆ DOM differing in at most one element, and all outputs S ⊆ Range(M)

\[ \Pr[M(D) \in S] \le e^{\epsilon} \cdot \Pr[M(D') \in S] + \delta, \]

where Range(M) denotes the set of all possible outputs of mechanism M.


δ allows a larger difference between the probability of mechanism output values and thus
relaxes the strict definition of ε-differential privacy. (ε, δ)-differential privacy offers a relaxed,
weaker guarantee compared to ε-differential privacy with the same value of ε. In fact, ε-differential
privacy can also be written as (ε, δ = 0)-differential privacy. Thinking of Definition 4 as providing
ε-differential privacy with probability 1 − δ, it is clear why the literature
demands δ to be ≪ 1/|D|.

The Gauss mechanism of Theorem 3 is frequently used to provide (ε, δ)-differential privacy
for real-valued functions. The Gauss mechanism uses global ℓ2-sensitivity Δf_2, as formalized in
Definition 5.

(a) Central setting. (b) Local setting.


Figure 3.7: Comparison of trust boundary between central and local setting. 


Definition 5 (Global ℓ2-Sensitivity) Let D and D' be neighboring. The global ℓ2-sensitivity of a
function f(·), denoted by Δf_2, is defined as

\[ \Delta f_2 = \max_{D, D'} \lVert f(D) - f(D') \rVert_2. \]


Theorem 3 (Gauss Mechanism [DR14]) Given a numerical query function f : DOM → R^k, the
Gauss mechanism

\[ M_{Gau}(D, f, \epsilon, \delta) = f(D) + (z_1, \ldots, z_k) \]

is an (ε, δ)-differentially private mechanism for any ε, δ ∈ (0, 1) when all z_i with 1 ≤ i ≤ k are
independently drawn from Z ~ N(0, σ²), with σ ≥ c Δf_2 / ε and c² > 2 ln(1.25/δ).


In contrast to the Laplace distribution, the Gauss distribution does not enjoy the sliding
property. Here, the guarantee of an observation being within a factor of e^ε between the likelihoods of
D and D' no longer holds for large noise values occurring in the distribution tails, bounded by δ.
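
The calibration of the Gauss mechanism can be sketched in Python as follows. The sketch assumes the calibration from [DR14], i.e., σ ≥ c Δf_2 / ε with c² > 2 ln(1.25/δ) for ε, δ ∈ (0, 1); the function names and the example parameter values are hypothetical.

```python
import math
import random


def gaussian_sigma(l2_sensitivity, epsilon, delta):
    """Smallest noise standard deviation satisfying the calibration stated above."""
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("this calibration requires epsilon and delta in (0, 1)")
    c = math.sqrt(2.0 * math.log(1.25 / delta))
    # The calibration asks for a strict inequality on c^2, so nudge c up slightly.
    c *= 1.0 + 1e-9
    return c * l2_sensitivity / epsilon


def gaussian_mechanism(true_answer, l2_sensitivity, epsilon, delta, rng=random):
    """Add independent Gaussian noise to every coordinate of a vector-valued answer."""
    sigma = gaussian_sigma(l2_sensitivity, epsilon, delta)
    return [value + rng.gauss(0.0, sigma) for value in true_answer]


# Illustrative parameters (assumptions): two counts released with l2-sensitivity 1.
print(gaussian_sigma(1.0, 0.5, 1e-5))
print(gaussian_mechanism([10.0, 20.0], 1.0, 0.5, 1e-5, random.Random(0)))
```

Note how σ grows only with the square root of ln(1/δ), so very small values of δ remain affordable in terms of added noise.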

Theorem 2 can be extended to hold for (ε, δ)-differential privacy by additionally considering Σ_{i=1}^{k} δ_i.


3.2.3 ε-Local Differential Privacy


Until now, we limited our discussion of differential privacy to the central setting where privacy is 
enforced by Curator, a trusted third party, by the application of differentially private mechanisms. 
However, given the data market focus of MOSAICrOWN, we also investigate stronger options 
without this trust assumption (and less accuracy), providing a choice for the market participants. 
We will refer to the setting that includes only two actors, Data Owner and Analyst, as local setting. 

Figure 3.7 provides a comparison of the actors and Data Owner trust boundary (dotted, blue
line) for the central and local setting. In Figure 3.7(b) we can observe that in the local setting they
exchange anonymized data, which is similar to the release of microdata discussed in the previous
chapter.

To apply differential privacy in the local setting we introduce ε-Local Differential Privacy in
Definition 6. Instead of bounding the difference between probability distributions from which a
query answer could have been produced, ε-LDP demands that for any possible input value from
a domain DOM an ε-LDP mechanism returns any possible value v_2 from DOM with non-zero
probability.


Definition 6 (ε-Local Differential Privacy) An algorithm M satisfies ε-Local Differential
Privacy (ε-LDP), where ε > 0, if and only if for any input v_1 and v_2, we have

\[ \forall S \subseteq \mathrm{Range}(M) : \Pr[M(v_1) \in S] \le \exp(\epsilon) \cdot \Pr[M(v_2) \in S], \]

where Range(M) denotes the set of all possible outputs of the algorithm M.

We will now illustrate the application of ε-LDP by discussing Randomized Response [War65].
Randomized response is a technique that provides plausible deniability to survey participants.
Assume a survey containing sensitive questions that can only be answered yes or no. The survey
participant is requested to flip a coin per question. If the coin shows heads, the participant
will report his true answer. Else, the participant will flip the coin again and report yes in case
of heads and otherwise no. This algorithm fulfills ε-LDP, since any true input can result in any
output from the domain {Yes, No}. To be more specific, for a fair coin the algorithm is ε = ln(3)
differentially private since:

\[ \frac{\Pr[r = \mathrm{Yes} \mid \mathrm{truth} = \mathrm{Yes}]}{\Pr[r = \mathrm{Yes} \mid \mathrm{truth} = \mathrm{No}]} = \frac{3/4}{1/4} = 3 = e^{\epsilon}. \]


Based on the noise generation (i.e., fair coin flip) we can reason about the true counts contained
in the noisy responses and improve the accuracy as follows. Assume we want to approximate
the true fraction of yes answers, which we denote p_y. The expected portion y of yes values under
a fair coin is (1/2) · p_y + 1/4, i.e., the true frequency (p_y) is reported when the flipped coin
shows heads (probability 1/2) and a random yes is reported when the flipped coin shows tails and a
second flip heads (probability 1/4). Given this equation, we can approximate the actual portion p_y
of yes answers as 2y − 1/2. Thus, these noisy responses allow us to approximate true answers of a
group while providing privacy for each individual.
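
The randomized response protocol and the correction of the observed yes fraction can be simulated with a few lines of Python. The sketch is illustrative: the function names, the population size and the true yes fraction of 30% are assumptions chosen for the simulation.

```python
import random


def randomized_response(truth, rng=random):
    """Report the true answer on heads; otherwise report the outcome of a second flip."""
    if rng.random() < 0.5:       # first flip shows heads: answer truthfully
        return truth
    return rng.random() < 0.5    # second flip: heads means yes, tails means no


def estimate_true_fraction(reports):
    """Invert y = p/2 + 1/4 to estimate the true yes fraction p from noisy reports."""
    y = sum(reports) / len(reports)
    return 2.0 * y - 0.5


rng = random.Random(42)
truths = [i < 30_000 for i in range(100_000)]  # hypothetical population, 30% yes
reports = [randomized_response(t, rng) for t in truths]
print(sum(reports) / len(reports))       # observed (noisy) yes fraction, close to 0.4
print(estimate_true_fraction(reports))   # corrected estimate, close to 0.3
```

The estimate is unbiased, but its variance is larger than that of the raw fraction, which is the accuracy price paid for removing the trusted curator.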

Protocols extending randomized response have, for example, seen adoption for small value
domains such as reporting user statistics (e.g., Erlingsson et al. [EPK14]). However, adaptation is
problematic in case of large value domains.


3.2.4 Discussing Utility and Privacy 


We recall that semantic privacy is motivated by the impossibility of releasing useful statistics about
a dataset without disclosing insights about the underlying respondents in a dataset. An extreme
example illustrates usefulness and revisits the differentially private median from Section 3.2.1.
To provide the technically strongest data protection, we could assign the same weight to every
entry o in the set of possible outputs O from the exponential mechanism. This would essentially
construct a uniform distribution and prevent any usefulness of the released data. Thus we decided
to specify a probability distribution that encodes a preference towards useful outputs (i.e., close to
the true median). This preference can be steered by adapting ε: by letting ε become small (≈ 0)
we increase the privacy and decrease the usefulness, by letting ε grow we decrease the privacy and
increase the usefulness. However, note that the effect of an increase/decrease in ε is relative with
respect to the underlying dataset and the evaluated function (cf. Definition 1 and [KM11]) and thus
there is no absolute guidance for good or bad ε values.

Besides the privacy parameter ε, the sensitivity Δf significantly affects the usefulness of
differentially private results. Assume we are provided a dataset D containing employee salaries for
SAP SE, and now want to provide the average salary

\[ \frac{\sum_{d \in D} d}{|D|} \]

with ε = 2 differential privacy. We can decompose the above computation into two functions: the sum
f_Σ(D) with sensitivity Δf = 10,000,000¹ and the count f_#(D) with sensitivity Δf = 1. The sequential
composition theorem (Theorem 2) allows us to split our budget of ε = 2 and thus we can compute the
DP average by applying the Laplace mechanism to f_Σ(D) and f_#(D) with ε = 1 each:

\[ \frac{f_{\Sigma}(D) + \mathrm{Lap}\left(\lambda = \frac{10{,}000{,}000}{1}, \mu = 0\right)}{f_{\#}(D) + \mathrm{Lap}\left(\lambda = \frac{1}{1}, \mu = 0\right)} \]

We directly observe that the noise scale is significantly larger for f_Σ(D) than it is for f_#(D) and
thus will provide more noise and significantly impact usefulness. It is therefore important
to investigate functions with small Δf or to bound Δf when applying DP, e.g., by clipping the
maximum individual salary in our example.

¹ The highest individual salary in 2018 was EUR 9.4 Mio.; SAP SE Annual Report on Form 20-F,
sap.com/docs/download/investors/2018/sap-2018-annual-report-form-20f.pdf
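
The effect of bounding the sensitivity by clipping can be illustrated with the Python sketch below. It is a simplified illustration: the function names and the synthetic salaries are hypothetical, the budget is split evenly between the sum and the count as in the example, and Laplace noise is drawn as the difference of two exponential variables.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_average(salaries, max_salary, total_epsilon, rng):
    """Differentially private average via a noisy sum and a noisy count."""
    eps_sum = eps_count = total_epsilon / 2.0  # sequential composition: even split
    # Clipping to [0, max_salary] bounds the sensitivity of the sum by max_salary.
    clipped = [min(max(s, 0.0), max_salary) for s in salaries]
    noisy_sum = sum(clipped) + laplace_noise(max_salary / eps_sum, rng)
    noisy_count = len(clipped) + laplace_noise(1.0 / eps_count, rng)
    # Guard against a non-positive noisy count on very small datasets.
    return noisy_sum / max(noisy_count, 1.0)


rng = random.Random(0)
salaries = [50_000.0] * 1_000  # synthetic data (assumption), true average 50,000
print(dp_average(salaries, 10_000_000.0, 2.0, rng))  # sensitivity bound from the text
print(dp_average(salaries, 200_000.0, 2.0, rng))     # hypothetical tighter clipping bound
```

With the tighter bound, the noise added to the sum shrinks by a factor of 50, at the cost of bias whenever real salaries exceed the clipping threshold.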
4. Quantifying Privacy for Machine Learning with Membership Inference


Machine learning is ubiquitous in software applications nowadays and the sharing of trained
machine learning models with third parties is relevant for data markets. However, the success of
machine learning (ML) depends as much on sophisticated algorithms as it does on the availability
of large sets of training data. Gathering sufficient amounts of training data for satisfying model
generalization has proven cumbersome, especially for sensitive data, and, in some cases, resulted
in privacy violations due to data misuse (e.g., the inappropriate legal basis for the use of
National Health Service (NHS) data in the DeepMind project [HMDD19]). Furthermore,
privacy threats are arising since the desire to identify on which data a model was trained gives
rise to two attack categories against machine learning models: model inversion, which aims at
reconstructing a training dataset with missing parts [TZJ+16], and membership
inference (MI) [SSSS17]. The latter attack strives to identify whether an individual, or a set of
individuals, belongs to a certain training dataset; thus, it is especially relevant for MOSAICrOWN.
We argue that membership inference is relevant for two actors: an adversary performing single
record MI and a regulator performing set MI. Single MI is used in previous work to model an
adversary who is mainly interested in identifying individuals within a dataset. However, set MI is
relevant for regulatory audits since it can be used to prove that a specific set of records was used
to train a model. If the practitioner who trained the model was not authorized to use a specific
dataset for this purpose, regulators can apply set MI to prove data privacy violations.


A mitigation to privacy violations and privacy threats is offered by anonymization with
differential privacy (DP) during machine learning training. Within this section we consider two machine
learning architectures that have been investigated for use in data markets throughout the first half
of MOSAICrOWN: feedforward neural networks for classification and generative models¹ for data
generation. For each architecture we discuss how privacy can be quantified by using membership
inference attacks and how privacy can be ensured by using differential privacy.


The results presented in this chapter have been published as a conference paper and
a preprint [BGRK19]. This chapter is structured as follows. First, we give some notation and
preliminaries in Section 4.1. In Section 4.2 we introduce membership inference threat models.
In Section 4.3 we suggest comparing privacy guarantees under a single membership inference
adversary for feedforward neural networks. In Section 4.4 we introduce and formalize two attacks
which are applicable to both single and set membership inference against generative models. To
this end, details regarding GANs and VAEs are provided.


¹ Generative models are ML models that are trained to learn the joint probability distribution p(X, Y) of features X
and labels Y of training data.

Symbol | Description
X      | Set of vectors x_1, ..., x_j, where the components of each x_i denote its attribute values (features).
Y      | Set of k target variables y_1, ..., y_k (labels).
C      | |Y|.
y      | Vector of target variables (labels) where variable y_i ∈ y represents the label for x_i ∈ X.
ŷ      | Predicted target variable, i.e., ŷ = h(x).
p(x)   | Softmax confidence for x.
D      | D := (X, Y).
d      | A record d ∈ D, where d := (x, y).
n      | |D|.

Table 4.1: Notations and context.


4.1 Preliminaries & Notation 


The set of notations that is used throughout this chapter is summarized in Table 4.1. In contrast to
anonymization methods based on generalization, differential privacy (DP) anonymizes a
dataset D = {d_1, ..., d_n} by perturbation. DP can be enforced either locally on each entry d ∈ D,
or centrally on an aggregation function f(D). In the following, we recall parts of the definition of
central and local differential privacy from Sections 3.2.1 and 3.2.3 in the new context of machine
learning.

Central Differential Privacy. In the central model the aggregation function f(·) is evaluated
and perturbed by a trusted server. Due to perturbation it is no longer possible for an adversary to
confidently determine whether f(·) was evaluated on D, or some neighboring dataset D' differing
in one element. Thus, assuming that every participant is represented by one element, privacy is
provided to participants in D as the impact of their presence (absence) on f(·) is limited.

To enforce DP in the central setting (CDP) in deep learning we use differentially private
versions of two standard gradient optimizers: SGD and Adam.² We refer to these CDP optimizers as
DP-SGD and DP-Adam. A differentially private optimizer represents a differentially private
training mechanism M_nn that updates the weight coefficients θ_t of a neural network per training step
t ∈ T with θ_t ← θ_{t−1} − α(g̃), where g̃ = M_nn(∂loss/∂θ_{t−1}) denotes a Gaussian perturbed gradient
and α is some scaling function on g̃ to compute an update, i.e., learning rate or running moment
estimations. Differentially private noise is added by the Gaussian mechanism of Definition 3 as
suggested by Abadi et al. [ACG+16]. After T update steps, M_nn outputs a differentially private
weight matrix θ which is used by the prediction function h(·) of a neural network. A CDP gradient
optimizer bounds the sensitivity of the computed gradients by a clipping norm C: every gradient
is clipped to this norm before perturbation.

Since weight updates are performed iteratively during training, a composition of M_nn is
required until the final training step T is reached and the final private weights θ are obtained. For
CDP, we measure privacy decay under composition by tracking the noise levels σ we used to
invoke the Gaussian mechanism. After training, we transform and compose σ under Rényi
differential privacy [Mir17], and transform the aggregate back to CDP. We chose this accumulation
method over other composition schemes (e.g., Advanced Composition or Moments
Accountant [KOV17, ACG+16]) since it provides tighter bounds for heterogeneous mechanism
invocations.

² The TensorFlow Privacy package: https://github.com/tensorflow/privacy
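The following is a minimal sketch of the two ingredients described above, not the TensorFlow Privacy implementation. `dp_gradient` clips per-example gradients to a clipping norm and adds Gaussian noise with σ = noise multiplier × clipping norm; `rdp_to_dp_epsilon` composes possibly different noise multipliers under Rényi DP and converts the result to (ε, δ)-DP. The function names, the range of Rényi orders, and the simplification that no subsampling amplification is used are assumptions made for illustration.

```python
import math
import random


def dp_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each per-example gradient to L2 norm <= clip_norm, sum the clipped
    gradients, add Gaussian noise with sigma = noise_multiplier * clip_norm,
    and return the average."""
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(v * v for v in grad))
        factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for k, v in enumerate(grad):
            total[k] += v * factor
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(per_example_grads) for v in noisy]


def rdp_to_dp_epsilon(noise_multipliers, delta, orders=range(2, 129)):
    """Compose one Gaussian mechanism invocation per entry of noise_multipliers
    under Renyi DP and convert to (epsilon, delta)-DP.

    Assumption: unit sensitivity relative to the clipping norm and no
    subsampling, so each invocation has RDP alpha / (2 z^2) at order alpha."""
    best = math.inf
    for alpha in orders:
        rdp = sum(alpha / (2.0 * z * z) for z in noise_multipliers)
        eps = rdp + math.log(1.0 / delta) / (alpha - 1)
        best = min(best, eps)
    return best
```

Because the Rényi terms are simply summed per order, invocations with different noise levels can be mixed in one list, which is the heterogeneous case mentioned above.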


Local Differential Privacy. We refer to the perturbation of entries d ∈ D as local differential
privacy (LDP) [WBLJ17]. LDP is the standard choice when the server that evaluates a function f(D) is
untrusted. We adapt the definitions of Kasiviswanathan et al. [KLN+08] to achieve LDP by using
local randomizers LR, i.e., we use Definition 6 with algorithm M = LR in the following.

In the experiments within this section, we use a local randomizer to perturb each record d ∈ D
independently. Since a record may contain multiple correlated features (e.g., pixels in an image,
items in a preference vector), a local randomizer must be applied repeatedly, which results in a
sequentially increasing loss of privacy. A series of local randomizer executions per record
composes a local algorithm according to Definition 7. ε-local algorithms are ε-local differentially
private [KLN+08], where ε is the sum of all composed local randomizer guarantees.


Definition 7 (Local Algorithm) An algorithm is ε-local if it accesses the database D via LR with
the following restriction: for all i ∈ {1, ..., |D|}, if LR_1(i), ..., LR_k(i) are the algorithm's
invocations of LR on index i, where each LR_j is an ε_j-local randomizer, then ε_1 + ... + ε_k ≤ ε.


We perturb low domain data with randomized response [War65], a (composed) local randomizer.
Randomized response yields ε = ln(3) LDP for a one-time collection of values from binary
domains (e.g., {yes, no}) with two fair coins [EPK14]. That is, the original value is retained with
probability p = 0.5; otherwise a value is sampled uniformly, so that each value is reported with
probability (1 − p) · p = 0.25. The reported value thus equals the original value with probability
0.75, which gives ε = ln(0.75/0.25) = ln(3).
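A minimal sketch of two-coin randomized response for a binary value follows. The function names and the encoding of {yes, no} as {1, 0} are assumptions for illustration; the ε computation is the usual ratio of the two reporting probabilities.

```python
import math
import random


def randomized_response(value, rng, p_keep=0.5):
    """Two-coin randomized response for a binary value (0 or 1).

    First coin: keep the true value with probability p_keep.
    Second coin: otherwise report a uniformly random value from {0, 1}."""
    if rng.random() < p_keep:
        return value
    return 1 if rng.random() < 0.5 else 0


def rr_epsilon(p_keep=0.5):
    """Privacy parameter of the mechanism above:
    ln(Pr[report v | true value v] / Pr[report v | true value not v])."""
    p_same = p_keep + (1.0 - p_keep) * 0.5
    p_flip = (1.0 - p_keep) * 0.5
    return math.log(p_same / p_flip)
```

With two fair coins (`p_keep = 0.5`) the ratio is 0.75 / 0.25, so `rr_epsilon()` evaluates to ln(3).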


In our evaluation we also look at image data, for which we rely on the local randomizer by
Fan for LDP image pixelation. The randomizer applies the Laplace mechanism of Definition 1
with scale λ ∝ m/ε to each pixel. Parameter m represents the neighborhood in which LDP
is provided. Full neighborhood for an image dataset would require that any picture can become
any other picture. As a rule of thumb, providing DP within large neighborhoods will require high
ε values to retain meaningful image structure, and vice versa. High privacy will result in uniform
random black and white images.

Within this section we consider the use of LDP and CDP for deep learning along a generic
data science process (e.g., CRISP-DM [WH00]). In such processes, the dataset D of a data owner
DO is (i) transformed, (ii) used to learn a model function h(·) (e.g., classification), which
(iii) afterwards is deployed for evaluation by third parties. In the following, h(·) will represent a
neural network. DP is applicable at every stage of the data science process: in the form of LDP
by perturbing each record d ∈ D, while learning h(·) centrally with a CDP gradient optimizer, or
during the evaluation of h(·) by federated learning with CDP voting [PAE+17]. We focus on the data
science process without collaboration and leave federated learning for future work.

When applying DP in the data science process, the privacy-accuracy trade-off is of particular
interest. Similar to the evaluation of regularization techniques that apply noise to the training data
to foster generalization (e.g., [Mat92]), we judge utility by the test accuracy of
h(·), i.e., the accuracy of h(·) on test data after having learned h(·) from training data.
4.2 Threat Models in Deep Learning


In this section, we introduce the threat model called Membership Inference (MI). 


4.2.1 Background of Membership Inference 


The goal of membership inference (MI) is to gather evidence that a specific record or a set of
records belongs to the training dataset of a given machine learning model. MI thus represents an
approach for measuring how much a model leaks about individual records of a population beyond
what it reveals about an arbitrary member of the population. The success rates of MI attacks
against a model are tightly linked to overfitting (i.e., the generalization error [YFJ17]). The more
poorly a model generalizes, the more specificities it contains about individual training data records.

In this section, two kinds of MI are considered: single MI and set MI. Single MI is
comparable to common experiment setups for MI [HMDD19]. In the set MI setting,
a regulator has to recognize which of two provided sets contains training data records. This
section considers two actors corresponding to single and set MI, respectively. The first actor is an
honest-but-curious adversary A and the second actor is a regulatory body R. Each actor focuses
on a specific task: the adversary A is common in MI literature and infers whether a single record
was present in the training dataset using single membership inference. The regulatory body R
performs set membership inference to identify whether a set of records was present in the training
dataset. This attack can provide evidence that a certain set of training data was illegally used to
train a generative model.

Both actors are assumed to have no access to the underlying training dataset of the generative 
model, and they refrain from activities that maliciously modify this target model. 


4.2.2 Adversarial Actor: Single MI 


Single MI has been used by previous work to evaluate attacks against GANs. In this
setting, the honest-but-curious adversary A has to identify individual records which were used to
train the model. To this end, M records from the training data and M records from the test data,
{x_1, ..., x_2M}, are given.

The membership inference attacks against generative adversarial networks and feedforward
neural networks that are discussed within this section rely on a function f(x) that can be computed
for each of the records. The intuition is that this function attains higher values for training data
records. Details on how this function is realized are given in the corresponding sections. In the
following description of the attack types we use the general notation f̂(x) for this function.

For every record x;, A has to decide whether it was part of the training data. In general, A 
picks the M records with the M greatest values of the function f(x). 


Attack Type 1 (Single Membership Inference) Let A be an adversary who is able to compute
the function f(x) for every record x.

1. Choose records {x_1, ..., x_M} from the training data.
2. Choose records {x_{M+1}, ..., x_2M} from the test data.
3. A is presented the set {x_1, ..., x_2M}.
4. A labels the M records with highest values f(x_i) as training data.
We denote the M records chosen by A as {x^A_1, ..., x^A_M}. We call the proportion of actual training
data in this set

$$\frac{\left|\{x^A_1, \ldots, x^A_M\} \cap \{x_1, \ldots, x_M\}\right|}{M}$$

the accuracy of the attack for single MI.
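Attack Type 1 and its accuracy can be sketched as follows, assuming the values f(x) have already been computed for the M training and M test records. The function name and the random tie-breaking are assumptions; the source does not specify how ties are handled.

```python
import random


def single_mi_accuracy(train_scores, test_scores, rng):
    """Label the M highest-scoring of the 2M records as training data and
    return the fraction of labelled records that truly are training records.

    train_scores / test_scores hold f(x) for the M training and M test records.
    Ties are broken randomly (assumption) by shuffling before a stable sort."""
    m = len(train_scores)
    if len(test_scores) != m:
        raise ValueError("balanced setting: M training and M test records")
    pooled = [(s, True) for s in train_scores] + [(s, False) for s in test_scores]
    rng.shuffle(pooled)
    pooled.sort(key=lambda pair: pair[0], reverse=True)
    return sum(1 for _, is_train in pooled[:m] if is_train) / m
```

In this balanced setting a score function that carries no membership signal yields an expected accuracy of 0.5.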


4.2.3 Regulatory Actor: Set MI 


Set MI corresponds to the needs of regulators and auditors aiming to prove data privacy violations
in machine learning. One set consisting of M records from the training data {x_1, ..., x_M} and
another set consisting of M records from the test data {x_{M+1}, ..., x_2M} are shown to a regulator R
in either order. The task of R is to decide which of the two sets is a subset of the original training
data. Contrary to single MI, R knows which records belong to the same data source (training data
or test data). However, R does not know which set is a subset of the original training data.

Similar to single MI, R computes the function f(x) for every record and selects the M records
with the M highest values f(x). For each of the selected records, R checks to which set it belongs
and eventually selects the set from which most of these records stem as the subset of the original
training data.³ Note that this is equivalent to taking the set with the higher median. Since we do
not have any prior knowledge of the type of distribution of the f-values, this is more robust than
considering the mean.


Attack Type 2 (Set Membership Inference) Let R be an adversary able to calculate the
function f(x) for every record x.

1. Choose records {x_1, ..., x_M} from the training data.
2. Choose records {x_{M+1}, ..., x_2M} from the test data.
3. R is presented the sets {x_1, ..., x_M} and {x_{M+1}, ..., x_2M}.
4. R identifies the M records with highest values f(x_i).
5. R chooses the set from which most of these records stem.
6. If both have the same number of representatives R picks one set randomly.


The accuracy of an attack of this type is defined as the average success rate of R, i.e., the 
probability that R identifies the true subset of the training data. 
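A minimal sketch of the decision rule of Attack Type 2 is given below. The set names "a" and "b" and the function name are hypothetical; the inputs are the precomputed values f(x) for the two presented sets.

```python
import random


def set_mi_guess(scores_a, scores_b, rng):
    """Return "a" or "b": the presented set guessed to be training data.

    The M highest values f(x) among all 2M records vote for the set they stem
    from; an equal number of votes is resolved by a fair random choice."""
    m = len(scores_a)
    if len(scores_b) != m:
        raise ValueError("both sets must contain M records")
    pooled = [(s, "a") for s in scores_a] + [(s, "b") for s in scores_b]
    rng.shuffle(pooled)  # random order among equal scores before the stable sort
    pooled.sort(key=lambda pair: pair[0], reverse=True)
    votes_a = sum(1 for _, origin in pooled[:m] if origin == "a")
    votes_b = m - votes_a
    if votes_a == votes_b:
        return rng.choice(["a", "b"])
    return "a" if votes_a > votes_b else "b"
```

The accuracy defined above would then be estimated as the fraction of repeated trials, each with freshly drawn sets shown in random order, in which the returned set is the true training subset.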


4.2.4 Relevance for Real-World Use Cases 


The formalized MI attack types are an alternative to assessing a single record x by computing f (x) 
and considering the record part of the training data if the value exceeds a threshold. While the 
single record approach is conceptually similar, the formalized types contributed in this section are 
closer to real-world use cases. For example, in machine learning as a service (MLaaS) applications 
access to both test and training data is implicitly given. Hence, the single MI and set MI attack 
types can be automatically conducted. Increased MI attack accuracies suggest that the model 
quality is insufficient w.r.t. privacy. 


³ If an equal number of records belong to the first and the second set, R picks one of the sets with probability 50%.
[Venn diagram regions: (a) train set of regulator's MI attack; (b) suspected illegal use; (c) actual illegal use;
(d) complete training data used; (e) test set of regulator's MI attack; (f) test data of regulator]


Figure 4.1: Venn diagram of training and test data in the regulatory use case for R. 


Figure 4.1 visualizes the regulatory use case. The regulator R suspects that a certain dataset
was illegally used to train a model (b). Actually, even more data was used illegally (c). Moreover,
some legally obtained data might have been used. Together with the illegal data, it represents the
complete training data (d). R's set of suspected data is used as the train set in the set MI attack (a).
R also needs test data (f), from which a subset (e) is used as the test set for the attack. If the attack is
successful, the illegal use can be proven. Otherwise, the attack does not perform better than random
guessing. By repeating the attack for multiple choices of subsets (a) and (f), R ensures statistical
significance. Note that R does not need to know the entire training data since the MI attacks also
work for subsets of the entire training data. The accuracy does not depend on the concrete subset
choice, as we will show in our experiments in Section 4.4.4.

Note that in both single and set MI we assume that there are exactly as many test records as 
training records. In the regulatory use case of set MI this is realistic since a sample of the larger 
of the two sets can be used if they are not of equal size. To make the results of single and set 
MI comparable, and to be in line with the balanced setting in previous work [SSSS17], we also 
decided to use this setup in single MI. Note that this is potentially an advantage for A. 


4.3 Quantifying Privacy Risks in Feedforward Neural Networks with MI


While several MI attacks have been formulated, this Deliverable solely refers to the black-box
single MI attack by Shokri et al. [SSSS17] as an exemplary membership inference attack against
feedforward neural networks. Section 4.3.1 introduces this MI attack. Throughout this section we
illustrate how the attack is used against a synthetic dataset, introduced in Section 4.3.2,
to assess LDP and CDP privacy parameters. The assessment is done through an experiment in
Section 4.3.3.


4.3.1 Black-Box MI Attack 


The black-box MI attack assumes an honest-but-curious adversary A with access to a trained
prediction function h(·) and predictions from h(·) (e.g., softmax confidence values). We refer to
the trained ML model against which the MI attack is applied as the target model. Within three steps
the MI attack exploits that an ML classifier such as a neural network tends to classify a record d
from the model's training dataset D^train_target with a different softmax confidence p(x) of h(·) for
its true label y than a record d ∉ D^train_target.

First, data owner DO trains an ML model, the target model, for some classification task with
records from a dataset D^train_target. After training, DO exposes the target model to the adversary A for
inference tasks, e.g., through an API. Second, A trains copies of the target model w.r.t. structure
and hyper-parameters, so-called shadow models, on data statistically similar to D^train_target. For any
i ≠ j, it holds that

$$\left|D^{train}_{shadow_i}\right| = \left|D^{test}_{shadow_i}\right| \;\wedge\; D^{train}_{shadow_i} \cap D^{test}_{shadow_i} = \emptyset \;\wedge\; \left|D^{train}_{shadow_i} \cap D^{train}_{shadow_j}\right| \geq 0.$$


After training, each shadow model is invoked by A to classify all respective training and test
data, i.e., p(x), ∀ d ∈ D^train_shadow_i ∪ D^test_shadow_i. Since A has full control over D^train_shadow_i and
D^test_shadow_i, each shadow model's output (p(x), y) is appended with a label "in" if the corresponding
record d ∈ D^train_shadow_i. Otherwise, its label is "out".

Third, a binary classifier, the attack model, is trained by A per target variable y ∈ Y to map p(x)
to the indicator "in" or "out". The triples (p(x), y, in/out) serve as attack model training data,
i.e., D^train_attack. The attack model thus exploits the imbalance between predictions on d ∈ D^train_target
and d ∉ D^train_target.

Finally, the attack model is evaluated on tuples (p(x), y) ∀ d ∈ D^train_target, which simulates the
worst case where A tests membership for all training records.
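The second and third step can be illustrated by how the attack model's training data is assembled from shadow model outputs. The sketch below assumes the shadow models have already been trained and queried; the function name and the data layout (one pair of prediction lists per shadow model) are hypothetical.

```python
from collections import defaultdict


def build_attack_training_data(shadow_outputs):
    """Assemble per-class attack-model training data from shadow model outputs.

    shadow_outputs: one (train_preds, test_preds) pair per shadow model. Each
    element of train_preds / test_preds is a (confidence_vector, label) tuple,
    i.e. (p(x), y) for one record of that shadow model's train or test data.
    Returns {label: [(confidence_vector, "in" or "out"), ...]} so that one
    binary attack model can be trained per target variable y."""
    per_class = defaultdict(list)
    for train_preds, test_preds in shadow_outputs:
        for confidences, label in train_preds:
            per_class[label].append((confidences, "in"))
        for confidences, label in test_preds:
            per_class[label].append((confidences, "out"))
    return dict(per_class)
```

Any binary classifier can then be fitted on each per-class list; which classifier is used is left open here.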

We find the black-box MI attack to be especially effective when some classes within the
training data comprise a comparatively low number of records and the overall training data distribution
is imbalanced. We observed these imbalanced classes especially when learning models from
sensitive or personal training data.


Evaluating CDP and LDP under MI 


Considering that both DP and MI are tailored to the protection and identification of individual data
records, we argue for evaluating DP privacy by MI attack precision and recall, and focus on two
privacy questions: “How many records predicted as in are truly contained in the training dataset?”
(precision), and “How many truly contained records are predicted as in?” (recall). We calculate
A's MI precision and recall as the average score over the instances of all classes to ensure
comparability to the results of Shokri et al. [SSSS17]. As illustrated above, DO has two options to apply
DP within the data science process: either LDP, by applying a local randomizer on the training
data and using the resulting LR(D^train_target) for training, or CDP, with a differentially private
optimizer on D^train_target. A discussion and comparison limited to the privacy parameter ε likely falls
short and potentially leads data scientists to incorrect conclusions. In particular, data scientists give up
flexibility w.r.t. applicable learning algorithms and may miss a favorable privacy-accuracy
trade-off if they rule out the use of LDP due to comparatively greater ε and instead solely investigate
CDP (e.g., DP-SGD). Instead, we suggest comparing LDP and CDP by their concrete effect on
an MI attack. While we consider the MI attack of Shokri et al. [SSSS17], our methodology is
applicable to other MI attacks as well. Depending on the notion of privacy, the MI attack scheme
described earlier in this section changes slightly. When examining CDP, we train both target and
shadow models on the same datasets as would be used without any anonymization. However, a
similarly configured CDP optimizer is used during the training phase for target and shadow models.
Adversary A obtains higher MI precision and recall when relying on an equally configured DP
optimizer for shadow model training, compared to use of the non-DP optimizer for shadow model
training. The LDP setup requires a local randomizer to perturb the training inputs for both target
and shadow models. See Figure 4.2 and Figure 4.3 for comparison.

Figure 4.2: MI under central DP with DP gradient optimizer.

Figure 4.3: MI against LDP in target model training.


We calculate the relative privacy-accuracy trade-off for LDP and CDP as the relative difference
between DO's change in test accuracy and the change in MI precision and recall, and introduce a
measure for quantification in the following. Efficient privacy-accuracy trade-offs in LDP or CDP
must reduce A's attack success without sacrificing significant test accuracy for DO. In the
following we define our measure w.r.t. MI precision. However, the definition is analogously applicable
to MI recall, and will be used for MI precision and recall throughout this work.


Let prec_orig be A's MI precision and acc_orig be DO's original test accuracy, and let prec_ε be
A's MI precision and acc_ε be DO's resulting test accuracy after application of LDP or CDP. Also,
let acc_base be the minimal test accuracy of 1/C and prec_base be the minimal MI precision of 0.5
(e.g., random guessing). For the calculation of φ we clip decreases below the baseline (i.e., set
values below the baseline to the baseline) since these indicate an adversary worse than random
guessing. Normalizing yields the mitigation efficiency φ as stated below. Equation (4.2) requires
prec_orig − prec_base ≠ 0 and acc_orig − acc_base ≠ 0, i.e., the original model is vulnerable to MI and
yields meaningful test accuracy.


$$
\varphi
= \frac{\dfrac{\text{maximum decrease accuracy} - \text{actual decrease accuracy}}{\text{maximum decrease accuracy}}}
       {\dfrac{\text{maximum decrease precision} - \text{actual decrease precision}}{\text{maximum decrease precision}}}
= \frac{\dfrac{(acc_{orig} - acc_{base}) - (acc_{orig} - acc_{\varepsilon})}{acc_{orig} - acc_{base}}}
       {\dfrac{(prec_{orig} - prec_{base}) - (prec_{orig} - prec_{\varepsilon})}{prec_{orig} - prec_{base}}}
\tag{4.2}
$$


φ quantifies the relative loss in accuracy in its numerator and the relative gain in privacy in
its denominator for a given privacy parameter ε. Hence, φ presents the relative privacy-accuracy
trade-off as a ratio which we seek to maximize. When the relative gain in privacy (lower MI
precision or MI recall) exceeds the relative loss in accuracy, φ will be > 1. In contrast, if the loss
in test accuracy exceeds the gain in privacy, φ will be < 1.
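Equation (4.2) can be evaluated with a few lines of code. The sketch below follows the definition term by term, including the clipping to the baseline; the function name and the handling of a zero denominator (MI precision pushed all the way to the baseline) are assumptions, since the source does not state how that case is treated.

```python
import math


def mitigation_efficiency(acc_orig, acc_eps, acc_base, prec_orig, prec_eps, prec_base):
    """Mitigation efficiency: share of above-baseline test accuracy that is
    retained, divided by the share of above-baseline MI precision that is
    retained. The same function applies to MI recall."""
    if acc_orig == acc_base or prec_orig == prec_base:
        raise ValueError("original model must exceed both baselines")
    # Clip values below the baseline to the baseline.
    acc_eps = max(acc_eps, acc_base)
    prec_eps = max(prec_eps, prec_base)
    retained_acc = (acc_eps - acc_base) / (acc_orig - acc_base)
    retained_prec = (prec_eps - prec_base) / (prec_orig - prec_base)
    if retained_prec == 0.0:
        # Assumption: the attack is fully mitigated, so the ratio is unbounded.
        return math.inf if retained_acc > 0.0 else math.nan
    return retained_acc / retained_prec
```

For example, with C = 10 (acc_base = 0.1), a drop in test accuracy from 0.8 to 0.6 together with a drop in MI precision from 0.9 to 0.6 gives a value well above 1, because most of the accuracy but only a quarter of the above-baseline precision is retained.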


4.3.2 Dataset 


Skewed Purchases We specifically crafted this dataset to mimic a situation for transfer
learning, i.e., the application of a trained model to novel data which is similar to the training data
w.r.t. format but follows a different distribution. This situation arises in a classification
scenario where customer shopping carts shall be classified to customer groups, if for example not
enough high-quality shopping cart data for a specific retailer are available yet. Thus, only few
high-quality data (e.g., manually crafted examples) can be used for testing and large amounts
of low-quality data from potentially differing distributions for training (e.g., from other
retailers). In effect, the distribution between train and test data varies for this dataset. The dataset
consists of 200,000 records with 600 features (e.g., products possibly contained in a shopping
cart) and is available in four versions with C ∈ {10, 20, 50, 100} labels. Each vector x in the
training dataset X is generated by using two independent random coins to sample a value from
{0, 1} per position i = 1, ..., 600. The first coin steers the probability Pr[x_i = 1] for a fraction of
the 600 positions per x. We refer to these positions as indicator bits (ind), which indicate products
frequently purchased together. The second coin steers the probability Pr[x_i = 1] for the remaining
600 − |ind| positions per x. We refer to these positions as noise bits (noise) that introduce
scatter in addition to ind. We let Pr_ind[x_i = 1] = 0.8 ∧ Pr_noise[x_i = 1] = 0.2, ∀ x ∈ X_train, and
Pr_ind[x_i = 1] = 0.8 ∧ Pr_noise[x_i = 1] = 0.5, ∀ x ∈ X_test, 1 ≤ i ≤ 600. This dataset has a difference
in information entropy between test and train data of ≈ 0.3. The difference would be ≈ 0 if there
were no skew. Figure 4.4 depicts the MI precision over the different training dataset distributions and
a fixed test distribution, and illustrates that datasets with varying train and test distributions are
actually more vulnerable to MI and thus potentially require stronger privacy parameters.
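A record generator in the spirit of this description can be sketched as follows. Which positions act as indicator bits for a class, and how many there are, is not recoverable from the text; the contiguous block layout and the `n_ind` parameter below are purely hypothetical. The entropy helper computes the per-bit entropy gap of the noise bits between test (0.5) and train (0.2), which lies close to the reported ≈ 0.3; whether the source computes the difference in exactly this way is an assumption.

```python
import math
import random


def sample_record(label, rng, n_features=600, n_ind=60, p_ind=0.8, p_noise=0.2):
    """Sample one binary vector. Positions [label * n_ind, (label + 1) * n_ind)
    are treated as indicator bits (hypothetical layout); all other positions
    are noise bits. Use p_noise=0.2 for training and p_noise=0.5 for test data."""
    start = label * n_ind
    record = []
    for i in range(n_features):
        p_one = p_ind if start <= i < start + n_ind else p_noise
        record.append(1 if rng.random() < p_one else 0)
    return record


def binary_entropy(p):
    """Shannon entropy (in bits) of a biased coin with Pr[1] = p, 0 < p < 1."""
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def noise_entropy_gap(p_noise_train=0.2, p_noise_test=0.5):
    """Per-bit entropy difference of the noise bits between test and train."""
    return binary_entropy(p_noise_test) - binary_entropy(p_noise_train)
```

Setting the train and test noise probabilities to the same value makes the gap vanish, which corresponds to the no-skew case.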


Figure 4.4: Skewed Purchases skew effect on MI precision (y-axis: MI precision, ranging from 0.0 to 1.0)
Dataset            LDP composed ε   Comment

Skewed Purchases   4,800 to 60      600 × ε_i


Table 4.2: Overview of LDP ε.


                           Skewed Purchases
                C          10       20       50       100

Learning rate   Baseline   10^-3    10^-3    10^-3    10^-3
                CDP        10^-3    10^-3    10^-3    10^-3
                LDP        10^-3    10^-3    10^-3    10^-3

Batch size      Baseline   100      100      100      100
                CDP        100      100      100      100
                LDP        100      100      100      100

Epochs          Baseline   200      200      200      200
                CDP        200      200      200      200
                LDP        200      200      200      200


Table 4.3: Hyperparameters


4.3.3 Experiment 


We perform a single experiment. The experiment compares LDP and CDP under MI precision and
recall instead of privacy parameter ε. The experiment is analyzed through three sets of figures.
First, by plotting test accuracy and MI precision and recall over ε, respectively. The three resulting
graphs map ε to MI precision, MI recall, and test accuracy. We present this information for CDP in
Figure 4.5 and for LDP in Figure 4.6. Second, by comparing the achievable MI precision and recall
over target model test accuracy in a scatterplot to identify strictly better privacy-accuracy
trade-offs. We present this information for LDP and CDP in Figure 4.7. Third, by calculating φ (cf.
Equation (4.2) in Section 4.3.1) to identify efficient privacy parameters for DO w.r.t. mitigating
MI precision. The corresponding figures are 4.5(d) for CDP and 4.6(d) for LDP.

For all executions of the experiment, CDP noise is sampled from a Gaussian distribution
(cf. Definition 3) with σ = noise multiplier z × clipping norm C. We evaluate increasing noise
regimes by evaluating noise multipliers z ∈ {2, 4, 6, 8, 16} until model convergence and calculate
the resulting ε at a fixed δ. We denote the non-private, original MI precision and DO test
accuracy as original. For LDP we use the same hyperparameters as in the original training and
evaluate randomized response as local randomizer. For each randomizer we state the individual
ε per invocation (i.e., per anonymized value) and ε per record (i.e., collection of dependent
values). We apply randomized response to the dataset with a range of privacy parameter values
ε_i ∈ {0.1, 1, 3, 5, 8} that reflect varying retention probabilities.
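The relation between the per-value parameter, the retention probability, and the per-record guarantee can be made concrete with a short sketch. It assumes the randomizer keeps the true bit with probability p and otherwise reports a uniformly random bit, for which ε_i = ln((1 + p)/(1 − p)); whether the experiments parameterize randomized response exactly this way is an assumption, and the function names are hypothetical.

```python
import math


def retention_probability(eps_i):
    """Retention probability p of a binary randomizer that keeps the true value
    with probability p and otherwise reports a uniform value, chosen so that
    one invocation satisfies eps_i-LDP: p = (e^eps - 1) / (e^eps + 1)."""
    return (math.exp(eps_i) - 1.0) / (math.exp(eps_i) + 1.0)


def composed_epsilon(eps_i, n_features=600):
    """Per-record guarantee under sequential composition over all features."""
    return n_features * eps_i


per_value = [0.1, 1, 3, 5, 8]
per_record = [composed_epsilon(e) for e in per_value]
retention = [retention_probability(e) for e in per_value]
```

For 600 features the per-record values span the range from 60 to 4,800 listed in Table 4.2, and ε_i = ln(3) corresponds to the two-fair-coin case p = 0.5.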


Skewed Purchases An effect which we observed in transfer learning practice leads us to further
analyze differing distributions between train and test data. This effect is, for example, encountered
when insufficient high-quality data for training is initially available and reference data that
potentially follows a different distribution has to be acquired for first training. CDP results in strong
privacy guarantees ε ∈ {3.5, 1.6, 1, 0.8, 0.4}.

Figure 4.5(a) states the changes in MI precision over privacy guarantees ε. While for the
simple classification tasks C ∈ {10, 20} MI precision remains at the original level over all ε, for
C = 50 MI precision drops close to the baseline at 0.53 already for ε = 3.5. In contrast, ε has
only a small effect on MI recall, as depicted in Figure 4.5(b). The baseline is solely reached for
C ∈ {50, 100} at ε = 0.4. Figure 4.5(c) presents target model accuracies over ε. The decrease
in test accuracy is comparatively stronger for C ∈ {10, 20} due to the low initial baseline for
C ∈ {10, 20}. CDP comes indeed at a heavy cost on this dataset: DP-SGD weakens A's MI
precision while significantly lowering DO's test accuracy.

Figure 4.5: DO accuracy and privacy analysis on Skewed Purchases (error bars lie within most
points) for CDP.

We state MI precision and recall over ε_i for LDP in Figures 4.6(a) and 4.6(b). While MI
precision is not much affected for C ≠ 100 until ε_i = 1, MI recall is solely affected by LDP at a
strong ε_i = 0.1 and C ∈ {10, 20}.

Figure 4.6(c) indicates that DO's test accuracy in LDP is robust to noise under randomized
response, especially for classification tasks for which n is much larger than the dimensionality l of
the training data per class (e.g., C ∈ {10, 20}). For C ∈ {10, 20} randomized response solely affects
DO's test accuracy under a strong ε_i = 0.1, in contrast to C ∈ {50, 100} at ε_i = 5. For C ∈ {10, 20}
we observe a regularization effect from randomized response which generalizes D^train_target towards
D^test_target. Here, the test-train gap narrows due to increasing test accuracy. Thus, the confidence
values also become similar and impede A's attack model in distinguishing predictions from D^train_target
and D^test_target, resulting in a decreasing MI recall.


Again we provide scatterplots for C = 10 and 100 in Figure 4.7. A meaningful
relative privacy-accuracy trade-off for MI precision and recall is only achieved under LDP for
C = 10 and 20. For C = 10, ε_i = 0.1 LDP trumps ε = 3.5 CDP. Figure 4.5(d) illustrates that almost
all φ < 1 for MI recall and precision under CDP. This observation supports our first impression
that CDP impacts DO's test accuracy more strongly than A's MI precision and recall. For LDP we
observe high φ in Figure 4.6(d) for C ∈ {10, 20} and strong ε_i ≤ 1. φ is ≈ 50 for MI recall at
ε_i = 0.1. However, these high efficiency values are mainly due to an increasing test accuracy over
ε_i at small decreases in MI recall and precision.

Figure 4.6: DO accuracy and privacy analysis on Skewed Purchases (error bars lie within most
points) for LDP.

Figure 4.7: DO accuracy and privacy analysis on Skewed Purchases (error bars lie within most
points) for LDP and CDP.


4.4 Quantifying Privacy Risks in Generative Models with MI 


In this section we introduce two novel MI attacks that can be used for both single and set MI.
Since the attacks target generative models, we briefly describe VAEs and GANs in
Section 4.4.1. The first attack, the Monte Carlo attack (Section 4.4.2), compares samples
drawn from the model to either test or train records. In contrast to existing approaches, only very
close samples are considered. Indeed, this distinguishes the attack from previous approaches like
the Euclidean attack and makes it effective. Furthermore, the Reconstruction
attack (Section 4.4.3), which is optimized for VAEs, is presented. A comparison of our attacks and
state-of-the-art attacks is given in Table 4.4. An attack is fully specified by the function f(x). We
evaluate the attacks in Section 4.4.4.


Attack                  Required Access                 Applicable          Idea

White-box               Discriminator                   GANs                Evaluate discriminator
Black-box               Samples from generative model   Generative models   Train auxiliary GAN on samples and evaluate discriminator
Monte Carlo             Samples from generative model   Generative models   Monte Carlo approximation on close samples
Reconstruction attack   VAE model                       VAEs                VAE reconstructs training data more precisely


Table 4.4: Comparison of Attacks


4.4.1 Generative Models 


Generative models are ML models that are trained to learn the joint probability distribution p(X, Y)
of features X and labels Y of training data. In this work we apply two decoder-based models relying
on neural networks, namely Generative Adversarial Networks (GANs) and Variational
Autoencoders (VAEs) [KW13]. Note, however, that our Monte Carlo attack is applicable to all
generative models from which one can draw samples. The reconstruction attack specifically targets
VAEs.


Generative Adversarial Networks 


A GAN consists of two competing models, a generator G and a discriminator D, which are trained
in an adversarial manner (i.e., compete against each other). We describe the approach in detail
referring to Figure 4.8.




Figure 4.8: Architecture of a Generative Adversarial Network (GAN). 


To generate artificial data, a prior z is sampled from a prior distribution p_noise (e.g., Gaussian)
and fed as input into the generator G. The task of the discriminator D is to output the probability
that a presented sample stems from the training data rather than from G. However, G tries to fool D by
generating samples that D misclassifies. Hence, the outputs G(z) should look similar to the training
data x (i.e., records sampled from p_data). This is expressed as a two-player zero-sum game via the
following objective function:

$$\min_G \max_D \; \mathbb{E}_{x \sim p_{data}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_{noise}}\left[\log\left(1 - D(G(z))\right)\right].$$
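As a small illustration, the objective can be estimated on a batch from the discriminator outputs alone. The sketch below is a plain empirical average of the two expectation terms and does not train anything; the function name and the argument layout are hypothetical.

```python
import math


def gan_value(d_real, d_fake):
    """Empirical estimate of E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs D(x) for records from the training data.
    d_fake: discriminator outputs D(G(z)) for generated samples.
    All outputs must lie strictly between 0 and 1."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term
```

The discriminator tries to increase this value, while the generator tries to decrease it. A discriminator that cannot tell the two sources apart outputs 0.5 everywhere, which gives 2 · log(0.5).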
Gradients are computed for G and D during training, and usually G produces realistic outputs
after only a few training steps. A conditional generative model is obtained by providing a
condition c (e.g., a class label) as an input both to the generator and the discriminator [GPM+14].


Variational Autoencoders 


VAEs consist of two networks - an encoder E and a decoder D. During training each 
record x is given to the encoder which outputs the mean E, (x) and variance Ey (x) of a Gaussian 
distribution. A latent variable z is sampled from this distribution N (Ey (x), Ex (x)) and fed into the 
decoder D. The reconstruction D(z) should be close to the training data record x. 

During training, two terms need to be minimized: first, the reconstruction error
$\|D(z) - x\|$, and second, $\mathrm{KL}(\mathcal{N}(E_\mu(x), E_\Sigma(x)) \,\|\, \mathcal{N}(0,1))$,
the Kullback-Leibler divergence between the distribution of the latent variables z and the unit
Gaussian. The second term prevents the network from only memorizing certain latent variables
because the distribution should be similar to the unit Gaussian.

In practice, both the encoder E and the decoder D are neural networks. Kingma et al. provide
details on how to train those networks given the training objective with the reparametrization
trick. Moreover, they motivate the training objective as a lower bound on the log-likelihood.
Sampling from the VAE is achieved by sampling a latent variable $z \sim \mathcal{N}(0,1)$ and
passing z through the decoder network D. The outputs of the decoder D(z) then serve as samples.
As for GANs, a conditional variant is obtained by providing a condition c as input to both the
encoder and the decoder.
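
The two training terms can be sketched in a few lines of Python. This is a toy illustration
under stated assumptions, not the training code used in the experiments: the Gaussian is taken
to have a diagonal covariance (so the KL term has a closed form), the reconstruction error is
the Euclidean norm, both terms are added without weighting, and `identity_decoder` is a
hypothetical stand-in for a trained decoder network.

```python
import math
import random

def kl_to_unit_gaussian(mean, var):
    """KL(N(mean, diag(var)) || N(0, I)) for a diagonal Gaussian (closed form)."""
    return 0.5 * sum(v + m * m - 1.0 - math.log(v) for m, v in zip(mean, var))

def sample_latent(mean, var, rng):
    """Reparametrization trick: z = mean + sqrt(var) * eps with eps ~ N(0, 1)."""
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mean, var)]

def vae_loss(x, mean, var, decoder, rng):
    """Single-sample estimate of reconstruction error plus KL term."""
    z = sample_latent(mean, var, rng)
    reconstruction_error = math.dist(decoder(z), x)
    return reconstruction_error + kl_to_unit_gaussian(mean, var)

rng = random.Random(0)
identity_decoder = lambda z: z  # hypothetical stand-in for a trained decoder
loss = vae_loss([0.0, 0.0], [0.0, 0.0], [1.0, 1.0], identity_decoder, rng)
```

The KL term vanishes exactly when the encoder outputs mean 0 and variance 1, which is the sense
in which it pulls the latent distribution towards the unit Gaussian.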


4.4.2 Monte Carlo Attack 


In this section we introduce the first attack, which is applicable to all generative models.
The intuition behind the Monte Carlo attack is that the generator G overfits if it tends to
output samples close to the provided training data. Formally, let $U_\varepsilon(x)$ denote the
$\varepsilon$-neighborhood of x, defined as $U_\varepsilon(x) = \{x' \mid d(x, x') \le \varepsilon\}$
with respect to some distance d. If a sample g of the generative model G is likely to be close
to a record x, the probability $P(g \in U_\varepsilon(x))$ is increased. It can be rewritten as


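$$P(g \in U_\varepsilon(x)) = \mathbb{E}_{g \sim p_{\text{generator}}}\left(\mathbf{1}_{g \in U_\varepsilon(x)}\right)$$

and approximated via Monte Carlo integration [Owe13]

$$\hat{f}_{\text{MC-}\varepsilon}(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{g_i \in U_\varepsilon(x)}, \tag{4.3}$$

where $g_1, \ldots, g_n$ are samples from $p_{\text{generator}}$. Note that samples $g_i$ of the
generator G are ignored if their distance to the training data record x is greater than
$\varepsilon$. In this attack, the estimate $\hat{f}_{\text{MC-}\varepsilon}(x)$ plays the role
of the function f(x) attaining higher values for training data records.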
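
As a minimal sketch of the estimate (4.3), the following Python function counts the fraction of
generated samples that fall into the neighborhood of a record. The function name `mc_epsilon`,
the toy samples and the default Euclidean distance are assumptions made for illustration; in
the attack, the distance would be one of the image distances described later.

```python
import math

def mc_epsilon(x, samples, eps, dist=math.dist):
    """Fraction of generated samples within distance eps of record x."""
    hits = sum(1 for g in samples if dist(x, g) <= eps)
    return hits / len(samples)

# Toy generated samples in two dimensions (hypothetical values).
samples = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [3.0, 4.0]]
score_near = mc_epsilon([0.0, 0.0], samples, eps=0.5)    # two of four samples are close
score_far = mc_epsilon([10.0, 10.0], samples, eps=0.5)   # no sample is close
```

A record that attracts many nearby samples receives a higher score and is therefore more likely
to be classified as training data.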

An alternative is provided by incorporating the exact distances $d(g_i, x)$ between the samples
$g_1, \ldots, g_n$ and the training data x, and computing

$$\mathbb{E}_{g \sim p_{\text{generator}}}\left(-\mathbf{1}_{g \in U_\varepsilon(x)} \log(d(g, x) + \delta)\right),$$

where a small $\delta$ is chosen to clip off large values (i.e., to avoid log(0)) if the
distance is zero. The logarithm ensures that outliers do not affect the results too much. The
Monte Carlo approximation is then given by

$$\hat{f}_{\text{MC-}d}(x) = -\frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{g_i \in U_\varepsilon(x)} \log(d(g_i, x) + \delta). \tag{4.4}$$

Here, the estimate $\hat{f}_{\text{MC-}d}(x)$ plays the role of the function f(x) used to conduct
the attack types presented above.
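
The distance-based variant can be sketched analogously. The code below is an illustration under
assumptions: the name `mc_distance`, the default value of `delta` and the toy data are
hypothetical, and the sign convention follows the requirement that the score is higher for
records with close samples.

```python
import math

def mc_distance(x, samples, eps, delta=1e-6, dist=math.dist):
    """Negative mean log-distance over the samples that lie within eps of x."""
    total = 0.0
    for g in samples:
        d = dist(x, g)
        if d <= eps:  # samples outside the eps-neighborhood are ignored
            total += math.log(d + delta)
    return -total / len(samples)

samples = [[0.0], [0.5], [2.0]]  # hypothetical one-dimensional samples
score_near = mc_distance([0.0], samples, eps=1.0)  # one sample coincides with the record
score_far = mc_distance([1.5], samples, eps=1.0)   # closest sample is 0.5 away
```

Because of the logarithm, a sample that coincides with the record contributes a large but
bounded amount, with the bound controlled by `delta`.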

In the case of GANs and VAEs, one obtains $g_i \sim p_{\text{generator}}$ by sampling
$z_i \sim p_{\text{noise}}$ and computing $g_i = G(z_i)$ and $g_i = D(z_i)$, respectively. Note
that only a sufficiently large number of samples has to be provided and no additional
information is required. Of course, both attack variants depend on the specification of the
distance $d(\cdot,\cdot)$. See below for details.

A further alternative to the attacks discussed could be realized using a Kernel Density
Estimator (KDE) [Par62]. In the following we briefly compare the Monte Carlo attack with this
estimator. An estimate of the likelihood f(x) of a data point x using KDE is given by

$$\hat{f}_{\text{KDE}}(x) = \frac{1}{n h^{d}} \sum_{i=1}^{n} K\!\left(\frac{x - g_i}{h}\right), \tag{4.5}$$

where K is typically the Gaussian kernel, h denotes the bandwidth, and the exponent d is the
dimension of the data. If this likelihood $\hat{f}_{\text{KDE}}(x)$ is significantly higher for
training data than for test data, the model fails to generalize. Likewise, the approximate
likelihood values $\hat{f}_{\text{KDE}}(x)$ can be used as the function f(x) to conduct the
single and set MI attack types. However, this attack variation did not perform better than
random guessing and is therefore not considered in our evaluation section.
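
For comparison, a KDE-based score can be sketched as follows. This is a plain-Python
illustration that assumes a Gaussian kernel and the usual normalization by the number of samples
times the bandwidth raised to the data dimension; the function names and toy data are
hypothetical.

```python
import math

def gaussian_kernel(u):
    """Standard Gaussian kernel evaluated at vector u."""
    dim = len(u)
    return math.exp(-0.5 * sum(c * c for c in u)) / (2.0 * math.pi) ** (dim / 2.0)

def kde(x, samples, h):
    """Kernel density estimate at x from generated samples with bandwidth h."""
    dim = len(x)
    total = sum(
        gaussian_kernel([(a - b) / h for a, b in zip(x, g)]) for g in samples
    )
    return total / (len(samples) * h ** dim)

samples = [[0.0], [0.2], [5.0]]  # hypothetical one-dimensional samples
density_near = kde([0.1], samples, h=0.5)
density_far = kde([3.0], samples, h=0.5)
```

Unlike the Monte Carlo estimates, every sample contributes to the KDE score, however far it is
from the record.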

Note that KDE can indeed be interpreted as a special case of the proposed distance-based
method (4.4), where

$$d(x, g_i) = 1 / \exp\!\left(h^{-d} K\!\left(\frac{x - g_i}{h}\right)\right) \quad \text{and} \quad \varepsilon = \max_{i=1,\ldots,n} d(x, g_i).$$

As KDE does not perform well for MI against generative models, this stresses that choosing the
right distance function seems to be key. In contrast to KDE, our attacks exclusively consider
samples significantly close to the training data x. To fully specify the Monte Carlo attacks,
concrete distance measures and heuristics for choosing $\varepsilon$ are required. We describe
our approach for this in the next two subsections.


Distance Measures 


Both Monte Carlo (MC) attack variants require a distance function $d(\cdot,\cdot)$, and the
distance plays an important role for the success of the MI attack. Therefore, a distance metric
suited for the specific data under consideration has to be chosen. For neural networks, image
recognition has become a key task; consequently, we formulate distance metrics for image data
in the following paragraphs.

Principal Components Analysis. Images are initially represented as a vector of their pixel in- 
tensities. A principal component analysis (PCA) is then applied to all vectors in the test dataset. 
The top 40 components are kept while all other components are discarded. When computing the 
distance between two new images the PCA transformation is first applied to their vectors of pixel 
intensities. The Euclidean distance of the two resulting vectors with 40 components each is then 
defined as the distance of the images. 

Histogram of Oriented Gradients. Histogram of Oriented Gradients (HOG) is a computer vision
algorithm enabling the computation of feature vectors for images. First, the image is separated
into cells. Second, the occurrences of gradient orientations in the cells are counted and a
histogram is computed. The histograms are normalized block-wise and concatenated to obtain a
feature vector. Again, the Euclidean distance of these vectors is used as image distance. This
approach was successfully used by Ebrahimzadeh et al. for an MNIST data classifier.

Color Histogram. According to the intensities in the three color channels, the pixels are sorted
into bins. For the pixels of one image, this results in a color histogram (CHIST) which can be
represented as a feature vector. The Euclidean distance of these vectors is defined as the image
distance.
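
The color histogram distance is simple enough to sketch in plain Python. Several details below
are assumptions, since the text does not fix them: the number of bins per channel, the use of a
joint (rather than per-channel) histogram, and the normalization by the number of pixels. The
function names are hypothetical.

```python
import math

def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of an image given as (r, g, b) tuples in 0..255."""
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        # Map each 8-bit intensity to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
    return [count / len(pixels) for count in hist]

def chist_distance(pixels_a, pixels_b, bins_per_channel=4):
    """Euclidean distance between the color histograms of two images."""
    return math.dist(
        color_histogram(pixels_a, bins_per_channel),
        color_histogram(pixels_b, bins_per_channel),
    )

red_image = [(255, 0, 0)] * 4    # toy image: four pure red pixels
blue_image = [(0, 0, 255)] * 4   # toy image: four pure blue pixels
same = chist_distance(red_image, red_image)
different = chist_distance(red_image, blue_image)
```

Such a distance ignores the spatial arrangement of pixels entirely, which is why it is only
meaningful for datasets with informative color content.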


Heuristics for $\varepsilon$


For the attack, all pairwise distances $d(x_i, g_j)$ of the records $x_i$ and samples $g_j$ need
to be computed. Samples with distances greater than $\varepsilon$ to the training data records
are ignored. Hence, an appropriate choice of $\varepsilon$ is crucial for the success of the
attack. We thus formulate two heuristics in the following.

Percentile Heuristic. The first heuristic is to use a fixed percentile of all pairwise distances
$d(x_i, g_j)$ as $\varepsilon$. By choosing, for example, the 0.1% percentile of the distances as
$\varepsilon$, we can ensure that the corresponding samples in an $\varepsilon$-neighborhood are
sufficiently close. Note that the MC-$\varepsilon$ and MC-d approaches are not necessarily
equivalent if this heuristic is employed.

Median Heuristic. The second heuristic avoids the need to choose an additional parameter such
as the percentile value. Again, the idea is to exploit the measured distances in the Monte Carlo
computation. In this approach, the median of the minimum distance to each record $x_i$ over all
generated samples $g_j$ is chosen:

$$\varepsilon = \operatorname*{median}_{1 \le i \le 2M} \left( \min_{1 \le j \le n} d(x_i, g_j) \right). \tag{4.6}$$

If $\varepsilon$ is chosen according to the median heuristic, the results of MC-$\varepsilon$
and MC-d are equivalent in both the single and set MI types, as there are always exactly M
records with $\hat{f}_{\text{MC-}\varepsilon}(x_i) > 0$ and $\hat{f}_{\text{MC-}d}(x_i) > 0$. A
comparison of the MC attack variants is provided in the evaluation in Section 4.4.4.
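
Both heuristics operate on the matrix of pairwise distances and can be sketched as follows. The
function names, the nearest-rank definition of the percentile and the toy data are assumptions
made for this illustration.

```python
import math
import statistics

def pairwise_distances(records, samples, dist=math.dist):
    """Row i holds the distances of record i to every generated sample."""
    return [[dist(x, g) for g in samples] for x in records]

def percentile_heuristic(distances, percent):
    """Nearest-rank percentile (percent in (0, 100]) of all pairwise distances."""
    flat = sorted(d for row in distances for d in row)
    rank = max(1, math.ceil(percent / 100.0 * len(flat)))
    return flat[rank - 1]

def median_heuristic(distances):
    """Median over records of the minimum distance to any generated sample."""
    return statistics.median(min(row) for row in distances)

records = [[0.0], [1.0], [5.0], [9.0]]  # hypothetical records (2M = 4)
samples = [[0.1], [1.3], [4.0]]         # hypothetical generated samples
dists = pairwise_distances(records, samples)
eps_median = median_heuristic(dists)
eps_percentile = percentile_heuristic(dists, 25)
records_with_hits = sum(1 for row in dists if min(row) <= eps_median)
```

In this toy example with four records, the median heuristic leaves exactly two records with at
least one sample in their neighborhood, which mirrors the statement that exactly M records
obtain a positive score.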


4.4.3 Reconstruction Attack 


The reconstruction attack is solely applicable to VAEs. During training, reconstructions D(z)
close to the current training data record x are rewarded. Hence, for training data more precise
reconstructions of the VAE can be expected. However, the outputs D(z) are not deterministic.
They depend on the latent variable z, which is sampled from the distribution
$\mathcal{N}(E_\mu(x), E_\Sigma(x))$ whose parameters are the output of the encoder network E.
Hence, we repeat this process n times and set

$$\hat{f}_{\text{rec}}(x) = -\frac{1}{n} \sum_{i=1}^{n} \|D(z_i) - x\|, \tag{4.7}$$

where $z_i$ ($i = 1, \ldots, n$) are samples from the distribution
$\mathcal{N}(E_\mu(x), E_\Sigma(x))$. This term is frequently used in practice as part of the
loss function of VAEs. One of the contributions of this work is to apply this loss to the
problem of membership inference. Specifically, the function $\hat{f}_{\text{rec}}(x)$ is applied
in the attack types as the discriminating function f(x). This induces the Reconstruction
attack. Note that this attack considers a strong adversary A with access to the VAE model.
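
A minimal sketch of the reconstruction score is given below. It is a toy illustration, not the
evaluated implementation: `toy_encoder` and `memorizing_decoder` are hypothetical stand-ins for
trained networks (the decoder simply returns the nearest memorized training record, an extreme
form of overfitting), the covariance is assumed diagonal, and the score is negated so that
higher values indicate training data.

```python
import math
import random

def reconstruction_score(x, encoder, decoder, n, rng):
    """Negative mean reconstruction error over n latent draws for record x."""
    mean, var = encoder(x)
    total = 0.0
    for _ in range(n):
        z = [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mean, var)]
        total += math.dist(decoder(z), x)
    return -total / n

training_records = [[0.0, 0.0], [1.0, 1.0]]  # hypothetical training data

def toy_encoder(x):
    # Hypothetical encoder: mean equal to the input, small fixed variance.
    return list(x), [0.01] * len(x)

def memorizing_decoder(z):
    # Hypothetical overfitted decoder: returns the closest training record.
    return min(training_records, key=lambda r: math.dist(r, z))

rng = random.Random(0)
member_score = reconstruction_score([0.0, 0.0], toy_encoder, memorizing_decoder, 100, rng)
nonmember_score = reconstruction_score([0.4, 0.4], toy_encoder, memorizing_decoder, 100, rng)
```

The member is reconstructed perfectly while the non-member is not, so thresholding the score
separates the two in this toy setting.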


4.4.4 Evaluation


The two MI attacks formulated in this work are evaluated in comparison to the white-box and
black-box MI attacks of Hayes et al. [HMDD19] against generative models trained on MNIST in
Section 4.4.4.

The white-box attack is solely applicable to GANs and requires access to the discriminator D.
The intuition behind this attack is that D tends to attain higher outputs D(x) if the record x
was part of the training data due to the indirect reward during training. Specifically, the
discriminator D plays the role of the function f(x) in this attack. The black-box attack
overcomes the limitation of the white-box attack in that it requires no access to D. It is
therefore not solely applicable to GANs. For the black-box attack, an auxiliary GAN is trained
with samples $g_1, \ldots, g_n$ from the target model, and the discriminator D' of this newly
trained model is used in a white-box manner. Consequently, records with increased D'(x) are
considered part of the training data. Hence, the discriminator D' serves as the function f(x)
attaining higher values for training than for test data. Though not explicitly tested in the
original paper, the black-box attack is also applicable to other generative models such as VAEs
since it only requires access to samples $g_1, \ldots, g_n$. In experiments, the white-box
attack performed significantly better than the black-box attack [HMDD19].

In general, our MC attacks outperformed the state of the art, i.e., the white-box attack of
Hayes et al. [HMDD19], on MNIST, which is considered a very hard dataset to attack with
membership inference due to its simplicity. Since the white-box attack is an upper bound for
the accuracy of the black-box attack, the black-box attack is outperformed as well. Since
several parameters have to be chosen before the attacks are applied, a study of the effect of
these parameters is presented in Section 4.4.4. Moreover, additional experiments on VAEs
trained on the MNIST dataset are provided in the same section.


Setup and Dataset 


We evaluated the attacks of Hayes et al. [HMDD19], the Monte Carlo attack and the
Reconstruction attack for differing 10% subsets of the MNIST dataset. The simple nature of
MNIST has proven to result in low MI precision in previous work. The remaining 90% are used as
test set in both the single and set MI attack type. To ensure a fair comparison we executed all
experiments repeatedly and report standard deviations. Neural networks are implemented with
TensorFlow [ABC+16], and for the HOG and PCA computations, the Python libraries scikit-image
and scikit-learn are used. Experiments were run on Amazon Web Services p2.xlarge (GAN) and
c5.2xlarge (VAE) instances.

We first describe the datasets and models used before analyzing the parameters of the attacks. 


MNIST. MNIST is a standard dataset in machine learning and computer vision consisting of
70,000 labeled handwritten digits, which are separated into 60,000 training and 10,000 test
records (http://yann.lecun.com/exdb/mnist/). Each digit is a 28 x 28 grayscale image. In all
subsequent experiments only a 10% subset of the training images is used for training to provoke
overfitting. The remaining 90% of the training data are used as test data to compute the
accuracies of the attacks. The actual MNIST test data are only used to define the PCA
transformation for the PCA-based distance. This ensures that the distance is not influenced by
the specific choice of the training data or the remaining 90%. Attacks are performed against
two state-of-the-art generative models, namely GANs and VAEs (cf. Section 4.4.1). For the GAN
we employ the widely used deep convolutional generative adversarial network (DCGAN)
architecture, which aims to improve both stability and quality of GANs for image generation.
This network relies on convolutional neural networks (CNNs), which are state of the art for
many computer vision tasks. We trained the DCGAN for 500 epochs (i.e., until convergence) with
a mini-batch size of 128. For the VAE we apply a standard architecture with dropout (keep
probability 90%) and a mini-batch size of 128. Due to the different convergence behavior, the
VAE is only trained for 300 epochs. For both models, GAN and VAE, we utilize the conditional
variant so that we can control which digit is generated.


                        HOG-based distance
Heuristic/Percentile    GAN Monte Carlo-d   GAN Monte Carlo-ε   VAE Monte Carlo-d   VAE Monte Carlo-ε
Median                  63.76±3.83          63.76±3.83          83.50±2.43          83.50±2.43
0.01%                   63.76±3.68          66.11±3.70          81.00±2.59          82.25±2.50
0.10%                   63.76±3.71          62.08±3.65          74.50±2.90          71.75±2.98
1.00%                   60.07±3.84          59.73±3.86          59.50±3.24          54.00±3.29


Table 4.5: Set accuracies (in %, with standard deviation) for R with HOG-based distance
depending on ε values


                        PCA-based distance
Heuristic/Percentile    GAN Monte Carlo-d   GAN Monte Carlo-ε   VAE Monte Carlo-d   VAE Monte Carlo-ε
Median                  74.84±3.25          74.84±3.25          99.75±0.25          99.75±0.25
0.01%                   74.84±3.31          71.94±3.40          95.50±1.34          91.75±1.80
0.10%                   64.84±3.69          59.68±3.78          94.75±1.52          95.50±1.43
1.00%                   47.42±3.77          51.61±3.76          60.75±3.21          58.50±3.29


Table 4.6: Set accuracies (in %, with standard deviation) for R with PCA-based distance
depending on ε values


Attack Parameters 


The effects of the attack parameters are analyzed in the following. Specifically, for the MC
attacks the effect of the heuristic for setting $\varepsilon$ and the number of samples n for
the Monte Carlo integration are studied. We expect these to be similar for both GANs and VAEs.
Hence, the analysis is restricted to the case of VAEs. For the Reconstruction attack, we study
how the number of samples n for the reconstruction error estimation affects the accuracy.


Monte Carlo Attack 


The set MI accuracies against VAEs trained on MNIST for different choices of $\varepsilon$ are
reported in Tables 4.5 and 4.6 for R. Note that the results of the MC-$\varepsilon$ and MC-d
attacks do not differ significantly. This suggests that the main contribution is the
introduction of $\varepsilon$, effectively ignoring samples which are further than
$\varepsilon$ away from the training records. In the case of the median heuristic, the two MC
attack variants yield equivalent performances as expected. However, the median heuristic
outperforms the percentile heuristic.

Besides the heuristic for $\varepsilon$, a sample size for the Monte Carlo approximation has to
be chosen. Hence, we also analyze the performance of the MC-$\varepsilon$ attack depending on
the sample size. Again, the MC-$\varepsilon$ attack is equivalent to the MC-d attack in the case
of the median heuristic. The single and set accuracies are stated in Figure 4.9 for A and R,
respectively. In general, higher percentile values ignore fewer samples since $\varepsilon$ is
increased. A smaller sample size is required to achieve optimal accuracy for these percentiles.
However, the accuracy of higher percentile values is inferior to that of lower percentile
values.

Note: the DCGAN implementation used https://github.com/yihui-he/GAN-MNIST as a starting point;
the VAE implementation used https://github.com/hwalsuklee/tensorflow-mnist-VAE as a starting
point.

For example, the 10% percentile attack already reaches its optimum in the minimal case of
3,000 samples and the 1% percentile saturates at $10^4$ samples. The 0.1% percentile approach
keeps gaining accuracy and does not level off at $10^6$ samples. It is noticeable that the
median heuristic always outperforms the other heuristics. We conjecture that this heuristic
levels off at a higher sample size. However, in practice there is a trade-off between
computational effort and accuracy of the attack. To study the effect, 20 experiments for the
median heuristic with $10^7$ samples each are conducted, achieving a single record MI accuracy
of 59.80 ± 3.50% for A and a set MI accuracy of 100.00 ± 0.00% for R. In the subsequent
experiments, we always use $10^6$ samples for the Monte Carlo simulations.

The median heuristic is superior to the percentile heuristic for all sample sizes. Moreover, no
parameter like the percentile is required. Thus, in all subsequent experiments we apply the
median heuristic, for which the MC-$\varepsilon$ and MC-d attacks are equivalent. We refer to
these equivalent approaches simply as the MC attack.

Figure 4.9: MC attack accuracy (differing scales) on MNIST with PCA-based distance against
VAEs depending on sample size.


Reconstruction Attack 


We also study the effect of the sample size n used to approximate the reconstruction error

$$\hat{f}_{\text{rec}}(x) = -\frac{1}{n} \sum_{i=1}^{n} \|D(z_i) - x\|. \tag{4.8}$$

In preliminary experiments even small sample sizes of n = 300 yielded good accuracies. This
suggests that the estimator $\hat{f}_{\text{rec}}(x)$ is accurate enough for small n values. To
ensure optimal results we conduct the subsequent experiments with $n = 10^6$ for the
reconstruction attack against a VAE trained on MNIST.


Results on MNIST 


Following the description of the Monte Carlo estimators of Section 4.4.2, we computed the
distance of every record $x_i$ to each of the samples $g_1, \ldots, g_n$. However, the label per
record is known and, as we used the conditional variants, we could control which digit was
generated by the target model under consideration. Thus, we only used samples representing the
same digit as the current record $x_i$.

For the MC $\varepsilon$-values we either used a certain percentile of all measured distances
or a dynamic $\varepsilon$ based on the median heuristic (cf. Section 4.4.2).


(a) GAN after 500 epochs    (b) VAE after 300 epochs


Figure 4.10: Generated digits of the GAN and VAE after training on the MNIST dataset. 


With a relatively simple dataset such as MNIST it is very hard to find the subtle replications of 
the training data if the model overfits. In more feature-rich image datasets we assume it should be 
easier to identify overfitting as even a visual inspection could suffice to recognize the replication of 
training images. In Figure 4.10, generated digits of both models trained with a random 10% subset
of MNIST can be seen. It appears to be nearly impossible to infer membership of training data 
by visual inspection as there are no remarkable replicated characteristics such as specific colors, 
elements in fore- or background etc. Note that the samples of both the GAN and the VAE are 
visually appealing. 

Having analyzed the parameters of our proposed attacks, we now compare their accuracies
with the recent white-box and black-box attacks of [HMDD19]. To stabilize the results, 10
different 10% subsets of the MNIST data are chosen as training data for the GAN and VAE models.
For every subset, 10 single and set MI attacks are conducted with M = 100. While we apply the
white-box attack against the GAN, we are limited to the black-box attack in the case of the VAE,
as the latter model does not feature a discriminator. In order to test the black-box attack, a
new GAN is trained with $10^6$ samples from the target VAE.

For the Monte Carlo estimator $\hat{f}_{\text{MC}}$ we use the PCA- and HOG-based distances
introduced in Section 4.4.2. The CHIST distance is not applicable since MNIST solely consists
of grayscale images. As described in the previous section, we use $n = 10^6$ samples and the
median heuristic. The resulting accuracies are depicted in Figure 4.11. The dotted horizontal
baseline at 50% is the average success rate of random guessing. In general, the accuracies of
single MI for A are significantly lower than those of set MI for R. Furthermore, all attacks
are much more successful if applied against VAEs instead of GANs. This suggests that in general
there is less overfitting in GANs. This observation is consistent with the Annealed Importance
Sampling measurements by Wu et al. [WBSG16].

Figure 4.11: Average accuracy (differing scales) of the attacks on MNIST in the single and set
experiments with standard deviation.


The black-box and white-box attacks do not perform significantly better than the baseline in
both experiments. The MC attack clearly outperforms these attacks. When used with the PCA
distance, our MC attack can even infer set membership with nearly 100% accuracy against a VAE.
For the GAN the accuracy is still about 75%. In general, accuracies are inferior if the HOG
distance is used. As a side note, the Monte Carlo based attacks with PCA distance take about
7 minutes each on a p2.xlarge instance on AWS. At the current price of 0.90 US$ per hour, the
attacks cause only minor costs. The specialized Reconstruction attack is superior to the MC
attack in the case of the VAE, yielding approximately 70% and 100% in the single and set MI
attack, respectively. The high accuracies of the attacks we proposed make them especially
attractive for the regulatory use case defined in Section 4.2.3.


Effect of Subset Choice 


It is unclear a priori how the specific choice of the MNIST 10% subset influences the accuracy
of the MC attack. In Figure 4.12 the average MC attack performance with PCA distance against
VAEs trained on different subsets is plotted. Attack performances seem independent of the
specific subset. We also conduct an F-test to evaluate whether the single accuracy means of the
four VAEs are different at $10^6$ samples, resulting in a p-value of 0.64. Hence, the hypothesis
that the means are equal cannot be rejected, i.e., the choice of the subset does not
significantly influence the attack results. We conclude that the accuracy depends on the size
of the training data rather than its specific members.
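
The test statistic behind such a comparison can be sketched as follows, assuming that the
F-test is a one-way analysis of variance across the four subsets (the text does not state the
exact variant). The function name and the accuracy values below are made up for illustration
and are not results from the experiments.

```python
def one_way_f_statistic(groups):
    """F statistic of a one-way ANOVA: between-group over within-group variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical single MI accuracies (in %) of repeated attacks per training subset.
accuracies = [
    [59.0, 60.5, 58.5],
    [60.0, 59.5, 58.0],
    [58.5, 60.0, 59.5],
    [59.5, 58.0, 60.5],
]
f_value = one_way_f_statistic(accuracies)
```

The p-value follows from the F distribution with (k - 1, N - k) degrees of freedom; a
statistics library such as SciPy (`scipy.stats.f_oneway`) returns both the statistic and the
p-value directly.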

We remark that in the experimental setups, M = 100 samples of the 10% subset of the training
data and 100 samples of the remaining 90% of the training data are chosen. The set MI
experiments yield high accuracies. Therefore, if a regulator suspects that some dataset was
used for training a model, this can be recognized with the novel attacks even though other data
might have been part of the training data as well. This is analogous to the experiment
described: although more training data was used, we focus on 100 samples. It is very likely
that the inappropriately used data are not the only data used to train the model. Hence, the
practicability of the MC attack is increased, since the regulator does not need to know all the
training data to prove that a certain subset was used.


[Figure 4.12 panels: (a) Adversarial actor: Single MI; (b) Regulatory actor: Set MI. x-axis:
Monte Carlo samples; one curve per training subset (Subset 1 to Subset 4).]


Figure 4.12: MC attack accuracy (differing scales) on MNIST with PCA distance depending on
sample size for four different training subsets.


        Monte Carlo (PCA dist.)      Reconstruction attack
Size    Single        Set            Single        Set
40%     50.79±0.27    57.50±3.24     57.35±0.37    98.50±1.11
20%     57.05±0.32    94.75±1.39     62.23±0.38    100.00±0.00
10%     59.93±0.26    99.75±0.25     70.09±0.37    100.00±0.00


Table 4.7: Accuracies (in %, with standard deviation) depending on MNIST training data size


Effect of Training Data Size and Regularization — Mitigations 


We also investigate how the size of the training dataset influences the success of the attacks
for the MNIST dataset. For this, five VAEs are trained with 20 experiments each; we restrict
the analysis to VAEs since the effect should be similar for GANs. The results for the MC attack
and the Reconstruction attack are reported in Table 4.7. When using 40% of the training data
instead of the usual 10%, the accuracy of the MC attack shrinks from 60% to 51% for single MI
and from nearly 100% to only about 58% for set MI. As expected, for 20% the effects are less
pronounced. Clearly, more training data would further reduce the effectiveness of the attacks.
However, in the case of the Reconstruction attack, the effects are less pronounced. Even if 40%
are used, the set accuracy is still close to 100% (98.50%), meaning that the Reconstruction
attack is more robust.

In general, the performance declines suggest that generative models make use of the additional 
information provided by additional training data. Similar effects were observed before in the case 
of the white-box attack [HMDD19].

However, in practice the amount of training data is often a bottleneck for training generative
models. Consequently, one could use regularization methods such as dropout [SHK+14] to improve
generalization. With dropout, certain neurons are switched off during training with a given
probability to increase the robustness of the network. In the standard case we already use
dropout with a keep probability of 90% both in the encoder and decoder of the VAE. We also
conduct experiments for the MC and Reconstruction attacks at lower keep rates of 70% and 50%.
For the MC attack, the accuracy in the set MI type decreases to 79% at a keep probability of
70% and to 65% at a keep probability of 50%. Again, the effects are less pronounced for the
Reconstruction attack, which still yields approximately 86% set MI accuracy for a 50% keep
rate. Detailed results are reported in Table 4.8. The results indicate that dropout can indeed
be used in practice to mitigate the proposed MI attacks. This can also be observed in the case
of the white-box attack [HMDD19]. However, a lower keep probability also causes the generated
images to get increasingly blurry, as depicted in Figure 4.13. Hence, there is an inherent
trade-off between high image quality and low MI attack accuracies.


        Monte Carlo (PCA dist.)      Reconstruction attack
Rate    Single        Set            Single        Set
50%     51.45±0.26    64.75±3.19     53.77±0.34    86.00±3.18
70%     53.17±0.29    78.50±2.71     58.31±0.40    97.00±1.56
90%     59.93±0.26    99.75±0.25     70.09±0.37    100.00±0.00


Table 4.8: Accuracies (in %, with standard deviation) depending on MNIST dropout keep rates
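
The role of the keep probability in dropout can be illustrated with a short sketch. It shows
the common "inverted" variant, in which kept activations are rescaled by the inverse keep
probability; whether the evaluated models use exactly this variant is an assumption, and the
function name and toy layer are hypothetical.

```python
import random

def dropout(activations, keep_prob, rng):
    """Keep each unit with probability keep_prob and rescale it; drop the rest."""
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]

rng = random.Random(0)
layer = [1.0] * 10000              # hypothetical layer activations
kept_90 = dropout(layer, 0.9, rng)  # standard setting: keep probability 90%
kept_50 = dropout(layer, 0.5, rng)  # stronger regularization: keep probability 50%

fraction_kept_90 = sum(1 for a in kept_90 if a != 0.0) / len(kept_90)
fraction_kept_50 = sum(1 for a in kept_50 if a != 0.0) / len(kept_50)
```

A lower keep probability removes more units per training step, which makes it harder for the
network to memorize individual records but also degrades the generated outputs.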

Figure 4.13: Generated samples of the trained models.


5. Conclusions


We have observed that anonymization can be a valuable tool to mitigate disclosure risks for
personal or sensitive data. However, proper anonymization is hard due to a number of inherent
challenges. Within MOSAICrOWN we provide and advance anonymization methods with quantitative
privacy parameters that allow balancing privacy and utility. These methods provide different
trade-offs, as detailed in this Deliverable, and can be used in combination for data
collections (e.g., by applying them on different parts of the data).

The first line of such methods is represented by the generalization-based approaches
(k-anonymity, $\ell$-diversity, t-closeness) for syntactic privacy. While these approaches have
interpretable privacy guarantees and have seen wide adoption for anonymized microdata release,
they consider specific aspects of the problem and remain vulnerable to some attacks, and their
applicability to high-dimensional data can be limited.

The second line of anonymization methods is represented by perturbation-based algorithms
($\varepsilon$-differential privacy/CDP, $(\varepsilon, \delta)$-differential privacy,
$\varepsilon$-local differential privacy/LDP) for semantic privacy. While these comparatively
young algorithms have seen wide adoption for perturbation of statistical functions, their
mathematically strict privacy guarantees are comparatively hard to interpret, which we aim to
alleviate.

We paid special attention to privacy in the context of machine learning. To support data
owners, this document used membership inference threat models to quantify privacy violations
and privacy threats in deep learning with the goal of identifying privacy interpretation
techniques for data marketplaces. We outlined two general techniques for data sanitization in
deep learning: local differential privacy (LDP) and (central) differential privacy (CDP). LDP
is suited for anonymization of microdata training records for deep learning. In contrast, CDP
allows anonymizing the deep learning optimization function during training (i.e., anonymized
macrodata release). Our initial experiment shows that data scientists should compare the
privacy-accuracy trade-off for LDP and CDP per dataset, and consider the relative
privacy-accuracy trade-off for LDP and CDP as the ratio of losses in accuracy and privacy over
privacy parameters $\varepsilon$. The choice of either differential privacy technique depends
on whether the party which is training the machine learning model is trusted (CDP) or not (LDP)
and what accuracy one wants to achieve (i.e., CDP for high accuracy with low $\varepsilon$).
The scope of MOSAICrOWN does not rule out trusted parties (or hybrid models where cryptographic
tools replace such a party), and thus supports both LDP and CDP. Furthermore, we formulated and
evaluated attacks (Monte Carlo attack, Reconstruction attack) to evaluate both overfitting and
information leakage of generative models.